The name is self-explanatory: metadata is data about data. Here is an example. Say you have recorded an hour-long interview with a woman about her job. When saving the audio file to your computer, you will probably give the file a name, something like “Jennifer - interview 3.mp3”. This is metadata. You might want to remember the date on which you recorded the interview. This is also metadata. Perhaps you upload the interview to the internet, but you only want a certain group of people to have access to the file. This access restriction is metadata as well.

One could say that metadata answers general, simple questions about a resource. If you have an audio file of an interview, for example, the metadata could answer the following questions:

  • “When was the resource created?”
  • “What type of file is the resource?”
  • “Who has access to the resource?”
  • “What are the topics of interest in this resource?”
  • “For which research fields is this resource of interest?”
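
The questions above can be sketched as a simple record. This is only an illustration in Python: the field names, the date, the access group, the topics and the research fields are all hypothetical values chosen for the interview example, not part of any standard.

```python
# Hypothetical metadata record for the interview recording in the example.
# All field names and values are illustrative assumptions.
interview_metadata = {
    "file_name": "Jennifer - interview 3.mp3",
    "created": "2023-05-14",                 # assumed recording date
    "file_type": "audio/mpeg",               # MIME type for MP3 audio
    "access": ["project-team"],              # assumed group allowed to listen
    "topics": ["work", "career"],            # assumed topics of interest
    "research_fields": ["sociology", "linguistics"],
}


def answer(record, field):
    """Return the stored value for a metadata field, or "unknown" if absent."""
    return record.get(field, "unknown")


# "When was the resource created?"
print(answer(interview_metadata, "created"))
# "Who has access to the resource?"
print(answer(interview_metadata, "access"))
```

Each question maps to one field, so anyone who receives the record can answer it without opening the audio file.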

If you are creating your own data, as in the example above, you may not need much metadata to remember what the files contain, because most of it is stored in your head. This is why metadating (yes, it is also a verb) is most relevant when depositing data into a large online archive for others to use. The more metadata is added to resources, the more easily people can find the data, and the easier it is for other researchers to decide whether the data is relevant to them without deep-diving into the data itself.

Dublin Core
It is important that metadata is understandable to anyone who wants to use the data. Different metadata standards have been created for this purpose, and each standard defines its own set of terms. A well-known and widely used standard is Dublin Core. Examples of Dublin Core terms are “created”, for the date on which the resource was created, and “language”, for the language of the resource. It is practical to use a well-known standard when metadating your own data, so that if your data is ever deposited into a larger archive, it can be integrated with little extra work.
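
As a rough sketch of what this can look like in practice, the Python snippet below writes a few Dublin Core terms for the interview example as XML. The namespace `http://purl.org/dc/terms/` is the one used for Dublin Core terms; the wrapper element `metadata`, the helper `to_dublin_core_xml` and all values are assumptions made for illustration, and a real archive will prescribe its own record format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core terms vocabulary.
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dcterms", DCTERMS)


def to_dublin_core_xml(fields):
    """Serialize a dict of Dublin Core term names and values to an XML string.

    The "metadata" wrapper element is a hypothetical container, not part of
    Dublin Core itself.
    """
    root = ET.Element("metadata")
    for term, value in fields.items():
        child = ET.SubElement(root, f"{{{DCTERMS}}}{term}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values for the interview recording.
record = {
    "title": "Jennifer - interview 3",
    "created": "2023-05-14",   # assumed recording date
    "language": "en",          # assumed language of the interview
}

xml_text = to_dublin_core_xml(record)
print(xml_text)
```

Because the term names come from a shared standard, an archive that understands Dublin Core can read “created” and “language” without needing any explanation of your personal naming habits.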