Tasks

We have formulated a few tasks meant to help you get started, but once you are up and running, it is perfectly fine to join up with other participants and follow your own path to the finish line.

Star chart 1708

Photo: Lars Kjær

Welcome to The Royal Danish Library’s Data Sprint, where we delve into the fascinating world of digital cultural heritage data. Through a series of hands-on tasks, you will gain valuable insights into data cleaning, analysis, and visualization. Below are the tasks you will be working on during the workshop:

Task 1:

Get to Know Your Data with OpenRefine

Objective: Learn the basics of OpenRefine for data cleaning.

Data Cleaning Basics:

  • Remove superfluous whitespace, such as leading and trailing spaces.
  • Decide whether the numbers in the 'place' column should be deleted or preserved, and remove them if you choose to delete them.
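
If you later want to repeat these two steps outside OpenRefine, they can be sketched in Python roughly as follows. This is only an illustration: the function name clean_place and the sample values are made up, and whether digits should be dropped at all is exactly the decision the task asks you to make.

```python
import re

def clean_place(value: str, drop_digits: bool = True) -> str:
    """Trim and collapse whitespace; optionally strip digits (hypothetical helper)."""
    if drop_digits:
        # Remove every run of digits, e.g. a year typed into the place field.
        value = re.sub(r"\d+", "", value)
    # split() with no argument drops leading/trailing whitespace and
    # treats any run of whitespace as one separator.
    return " ".join(value.split())

# Made-up sample values, not taken from the workshop dataset.
samples = ["  Kiøbenhavn ", "Kiøbenhavn 1708", "Odense  2"]
cleaned = [clean_place(s) for s in samples]
print(cleaned)
```

Passing drop_digits=False keeps the numbers while still normalizing the whitespace, which makes it easy to compare both choices.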

Text Facets and Clustering:

  • Experiment with different clustering algorithms on the 'author' and/or 'place' columns.
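
To get a feel for what a key collision clustering method does, here is a simplified Python sketch in the spirit of OpenRefine's "fingerprint" method. It is an approximation under stated assumptions, not OpenRefine's actual implementation: the function names and the sample author strings are invented for illustration, and letters such as 'ø' are left unchanged here.

```python
import string
import unicodedata
from collections import defaultdict

def fingerprint(value: str) -> str:
    """Build a normalized key so that near-identical strings collide."""
    value = value.strip().lower()
    # Decompose accented characters and drop the combining marks.
    value = unicodedata.normalize("NFKD", value)
    value = "".join(ch for ch in value if not unicodedata.combining(ch))
    # Remove ASCII punctuation such as commas and full stops.
    value = value.translate(str.maketrans("", "", string.punctuation))
    # Split into tokens, de-duplicate, sort, and join again.
    return " ".join(sorted(set(value.split())))

def cluster(values):
    """Group values that share a fingerprint; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return [sorted(g) for g in groups.values() if len(g) > 1]

# Invented examples of how one author might be written in a catalogue.
authors = ["Holberg, Ludvig", "Ludvig Holberg", "holberg ludvig ", "Kingo, Thomas"]
print(cluster(authors))
```

Because the tokens are sorted before joining, "Holberg, Ludvig" and "Ludvig Holberg" end up with the same key. Other clustering methods, for example nearest-neighbour approaches, trade this strictness for more tolerance of spelling differences.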

Working with Subsets:

  • Work with a subset of the dataset for faster processing.

Creating New Columns:

  • Create new columns based on existing ones. For example, preserve the original column and save the clustered values in a new column.

Timeline Creation:

  • Create a timeline using the 'year' column (ensure the dataset is cleaned first).
  • Use the sort function to compare years and to spot different ways of writing city names.

Author Analysis:

  • Choose an author and investigate how the themes and topics of their books change over time.

Task 2:

Reflect on Your Choices

Objective: Reflect on the short-term and long-term implications of your methodological decisions in data cleaning.

Reflection Exercise:

  • Consider the impact of standardizing city names and other data cleaning choices on your analysis.

Task 3:

Create Tidy Data and Export

Objective: Clean and export your dataset for further analysis and visualization.

Data Cleaning and Export:

  • Clean and export the dataset so it can be analyzed and visualized using Python, R, or Orange.
  • Enrich your data with geographical coordinates using OpenStreetMap.
  • Visualize the locations where books have been published on a map.
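
One possible way to look up coordinates is the Nominatim search service, which is built on OpenStreetMap data. The sketch below assumes that service and uses only the Python standard library; the function names and the user_agent value are placeholders chosen for illustration. Nominatim's usage policy asks for an identifying User-Agent and at most one request per second, which is why the sketch pauses after each call.

```python
import json
import time
import urllib.parse
import urllib.request

NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"

def build_query_url(place: str) -> str:
    """Build a search URL asking for the single best match as JSON."""
    params = {"q": place, "format": "json", "limit": 1}
    return NOMINATIM_URL + "?" + urllib.parse.urlencode(params)

def parse_coordinates(payload: str):
    """Return (lat, lon) from a JSON response, or None if nothing matched."""
    results = json.loads(payload)
    if not results:
        return None
    # The service returns latitude and longitude as strings.
    return float(results[0]["lat"]), float(results[0]["lon"])

def geocode(place: str, user_agent: str = "data-sprint-example"):
    """Look up one place name (requires network access)."""
    request = urllib.request.Request(
        build_query_url(place), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        coords = parse_coordinates(response.read().decode("utf-8"))
    time.sleep(1)  # stay within one request per second
    return coords
```

Calling geocode("København") for each distinct, cleaned place name (rather than for every row) keeps the number of requests small. The resulting latitude and longitude columns can then be plotted on a map in Python, R, or Orange.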

Task 4:

Danish Book History - What Do Titles Reveal?

Objective: Analyze book titles to uncover trends and patterns in Danish book history.

Title Analysis:

  • Examine titles for reprints; identify which titles/books have been popular enough to be reprinted multiple times throughout Danish history.
  • Compare the length of typical titles across different historical periods.
  • Investigate different spellings associated with various historical periods or themes (e.g., reyse, reise, or rejse). Determine when the modern spelling 'rejse' began to be used.
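
As a starting point for the spelling and title-length questions, a small Python sketch could look like this. The records below are invented purely to show the mechanics and say nothing about the real dataset; the function names are likewise illustrative, and grouping by century is just one possible definition of a "historical period".

```python
import re
from collections import defaultdict

# Invented (year, title) pairs, NOT taken from the workshop dataset.
records = [
    (1708, "En kort Reyse til Norge"),
    (1790, "Reise igiennem Island"),
    (1875, "En Rejse til Italien"),
    (1902, "Rejse i Grønland"),
]

# Match the three spellings at the start of a word, ignoring case.
VARIANTS = re.compile(r"\b(reyse|reise|rejse)", re.IGNORECASE)

def first_year_by_variant(rows):
    """Return the earliest year in which each spelling variant appears."""
    first = {}
    for year, title in rows:
        for match in VARIANTS.findall(title):
            variant = match.lower()
            first[variant] = min(year, first.get(variant, year))
    return first

def mean_title_length_by_century(rows):
    """Return the average number of words per title, grouped by century."""
    lengths = defaultdict(list)
    for year, title in rows:
        century = year // 100 * 100
        lengths[century].append(len(title.split()))
    return {c: sum(v) / len(v) for c, v in sorted(lengths.items())}

print(first_year_by_variant(records))
print(mean_title_length_by_century(records))
```

On the real data, the earliest year found for 'rejse' is only as reliable as the cleaned 'year' column, so it is worth checking the matching rows by hand.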

By participating in these tasks, you will not only enhance your data handling skills but also gain a deeper understanding of how digital tools can be used to explore and interpret cultural heritage data. We look forward to seeing the insights you uncover and the skills you develop during this workshop!