Yanchen Lu

February 21, 2023

Concept Summary

This project analyzes the check-out and check-in data for books related to fiber and textile crafts. I looked into 4 keywords primarily, "Crochet", one of its sub-form "Amigurumi", "Macrame", and its sub-form "Friendship Bracelet". Then, I queried for book records with titles containing these keywords, and measured the length of time each book spent outside the Seattle Public Library. Additionally, I aggregated the number of check-outs and check-ins that occurred for each of the four categories of books on each day to produce frequency datasets.

The queried data contains records from 2005 until the first two weeks of February 2023. The year-month-day timestamp data maps each record to a point in the 3D space, with days of a month organized along the x-axis, and months and years organized along the z-axis, with the height of each curve (along -y) determined by the number of days each book spent outside the SPL. For the aggregated frequency data, I visualized the number of check-outs and check-ins on each day as spheres, with the size of each sphere corresponding to check-out and check-in counts.

The GUI allows the user to navigate the 3D space and includes toggles to show or hide components of the visualization, revealing patterns in the dataset.

MySQL Query + Data PreProcessing

I first explored check-out and check-in records of books with keywords “Crochet”, “Amigurumi”, “Macrame”, and “Friendship Bracelet”.

SELECT *
FROM (
    SELECT DISTINCT(barcode), itemNumber, title, cout, cin, deweyClass, bibNumber, collcode, callNumber
    FROM spl_2016.inraw
    WHERE title LIKE '%macrame%'
    OR title LIKE '%friendship bracelet%'
    OR title LIKE '%crochet%'
    OR title LIKE '%amigurumi%'
) AS macrame_books
ORDER BY itemNumber, barcode, cout;

Screenshot of dataset from the 1st query

Screenshot of dataset from the 1st query

In the second query, I counted the number of days between check-out and check-in dates for each record, as “time spent outside the SPL”, and added a column to denote the artform category.

SELECT *
FROM
	(SELECT
		itemNumber, title, cout, cin, deweyClass,
		DATEDIFF(cin, cout) AS time_outside_lib,
		(CASE
			WHEN (title LIKE '%macrame%') THEN 'macrame'
			WHEN (title LIKE '%friendship bracelet%') THEN 'friendship bracelet'
			WHEN (title LIKE '%crochet%') THEN 'crochet'
			WHEN (title LIKE '%amigurumi%') THEN 'amigurumi'
		END) AS artform
	FROM
		spl_2016.inraw
	WHERE
		title LIKE '%macrame%'
		OR title LIKE '%friendship bracelet%'
		OR title LIKE '%crochet%'
		OR title LIKE '%amigurumi%') AS textile_art
ORDER BY artform, itemNumber, cout;

Screenshot of dataset from the 2nd query, added time_outside_lib and artform columns

Screenshot of dataset from the 2nd query, added time_outside_lib and artform columns

There are only two irrelevant books in the resulting data from the query, “Dead men dont crochet” and “un friendship bracelet”. I removed them from the dataset.

I noticed that there are a lot of duplicate check-out dates that match to different check-in dates, even if it’s the same copy of a book. It might be due to readers renewing books repeatedly before finishing and returning the book to the library. Therefore, I chose to keep the last check-in date for each check-out record of each book copy. The dataset is in data_nodup_last.csv.

Dataset after removing duplicate check-out dates

Dataset after removing duplicate check-out dates

Lastly, I aggregated check-out and check-in counts for each date within the dataset. This produced two more datasets, aggregated_cout.csv with check-out counts, and aggregated_cin.csv with check-in counts.

Ideation+Sketches

I started out with the idea of a calendar page. The design evolved from an opened book, to a cylinder consisted of 12 planes, then to a flat disc that represents each year. I sketched out the geometry in processing. It formed a spindle like structure. However, I didn’t really like the shape.

Spindle geometry

Spindle geometry

So, I decided to go back to my initial calendar idea. With each line presenting a year and the progression through the decades, it forms a 2D plane that maps each day from 2005/1/1 to 2023/2/14. The duration of a check-out → check-in record would determine the height of its curve.

Each year as a line, from 2005 to 2023

Each year as a line, from 2005 to 2023

The geometry I ended up going with, each year as a rectangle

The geometry I ended up going with, each year as a rectangle

Since each horizontal line represents dates increasing to the right, though the right most dates may actually be closer in time with the left most dates on the neighboring line, the length the of the curve connecting those dates would be disproportionately long. Therefore, I thought using a “snake grid”, altering the direction or order of the dates on horizontal lines might facilitate a more accurate visual representation.

sketch.pdf

The Visualization

3D Space