Yanchen Lu

January 24, 2023

Concept Description

This project is an exploration of books and media regarding LGBTQ topics in the Seattle Public Library, how they are represented and categorized by the library, as well as how the public’s interest in them have changed throughout the years.

As a bisexual person growing up, I always felt that queer literature and media were limited or insufficiently represented compared to other subjects. Access to these resources is an important way for both members of the LGBTQ community and people outside of it to learn about, understand, and accept these marginalized identities and come to appreciate queer sub-cultures. Fortunately, as more countries decriminalized and/or legalize same-sex marriage, and as (western/US) culture shifts to become more accepting of the LGBTQ community, I have started to see more and more authors and creators advocate for queer representation in literature and media. Therefore, I set out to investigate what types of queer books/media can the public access from the Seattle Public Library, and how has the public’s interest in them has grown and shifted.

MySQL Queries

First, I need to figure out how to extract a more or less complete collection of books/media on queer topics from the library’s database. One way the topic or the subject of a book is reflected is through its title. My first query selects all books/media containing LGBTQ keywords in their titles, from the spl_2016.outraw database of checkout records.

The keywords include ‘lesbian’, ‘gay’, ‘bisexual’, ‘transgender’, ‘queer’. These keywords are part of the LGBTQ acronym, and so they should cover a good amount of relevant books/media. (The acronym can be extended to LGBTQIA+ to include more identities. I added more keywords in later queries correspondingly.) The keyword ‘homosexual’ is also included because it’s often used as an umbrella term to describe non-straight relationships.

However, this query has two main issues:

SELECT 
    DISTINCT(bibNumber), title, deweyClass
FROM
    spl_2016.outraw
WHERE
    title LIKE '%homosexual%'
	OR title LIKE '%lesbian%'
    OR title LIKE '%gay%'
    OR title LIKE '%bisexual%'
    OR title LIKE '%transgender%'
    OR title LIKE '%queer%'
ORDER BY bibNumber
LIMIT 1000;

Therefore, I decided to explore how the spl_2016.subject database can help me better capture and extract queer books/media.

I found that the same set of keywords is able to cover a wide range of categories that should help me extract a more complete collection. Joining the spl_2016.outraw database with the spl_2016.subject database on each row’s bibNumber column matches the book/media to the subjects it was assigned.

I explored 2 ways to join the tables, in anticipation of missing/mismatched records from 2 separate databases

to see how the coverage may change.

# spl_2016.outraw LEFT JOIN spl_2016.subject
# query result has 8000+ rows
SELECT 
    *
FROM(
	SELECT DISTINCT(bibNumber) AS distinct_bib, title, deweyClass, itemtype
    FROM
        spl_2016.outraw
    WHERE
        title LIKE '%homosexual%'
		OR title LIKE '%lesbian%'
		OR title LIKE '%gay%'
		OR title LIKE '%bisexual%'
		OR title LIKE '%transgender%'
		OR title LIKE '%queer%'
        OR title LIKE '%intersex%'
		OR title LIKE '%asexual%'
    ORDER BY bibNumber
) queer_titles
LEFT JOIN spl_2016.subject
ON distinct_bib = spl_2016.subject.bibNumber;
# spl_2016.subject LEFT JOIN spl_2016.outraw
# query result has around 14K rows
# though it does involved duplicates
SELECT *
FROM(
	SELECT DISTINCT(bibNumber) AS distinct_bib, subject
	FROM
		spl_2016.subject
	WHERE
		spl_2016.subject.subject LIKE '%homosexual%'
		OR spl_2016.subject.subject LIKE '%lesbian%'
		OR spl_2016.subject.subject LIKE '%gay%'
		OR spl_2016.subject.subject LIKE '%bisexual%'
		OR spl_2016.subject.subject LIKE '%transgender%'
		OR spl_2016.subject.subject LIKE '%queer%'
        OR spl_2016.subject.subject LIKE '%intersex%'
		OR spl_2016.subject.subject LIKE '%asexual%'
	ORDER BY bibNumber
) bib_subj
LEFT JOIN (
	SELECT DISTINCT(bibNumber) AS out_bib, title, itemtype, deweyClass
	FROM
		spl_2016.outraw
)outraw_titles
ON distinct_bib = out_bib;

The query that searched for matching keywords from subjects produced a more complete collection with fewer irrelevant entries upon manual inspection.

Next, I want to investigate how queer books/media are classified and identify interesting patterns.

I selected distinct titles that have a Dewey Classification, to see what these works are classified as. Unfortunately, there is a non-negligible amount of books/media not categorized in the Dewey Classes, which may cause the distribution to be skewed.

SELECT DISTINCT(title), deweyClass
FROM (
	SELECT *
	FROM(
		SELECT DISTINCT(bibNumber) AS distinct_bib, subject
		FROM
			spl_2016.subject
		WHERE
			spl_2016.subject.subject LIKE '%homosexual%'
			OR spl_2016.subject.subject LIKE '%lesbian%'
			OR spl_2016.subject.subject LIKE '%gay%'
			OR spl_2016.subject.subject LIKE '%bisexual%'
			OR spl_2016.subject.subject LIKE '%transgender%'
			OR spl_2016.subject.subject LIKE '%queer%'
      OR spl_2016.subject.subject LIKE '%intersex%'
			OR spl_2016.subject.subject LIKE '%asexual%'
		ORDER BY bibNumber
	) bib_subj
	INNER JOIN (
		SELECT title, itemtype, deweyClass, bibNumber as out_bib
		FROM
			spl_2016.outraw
		WHERE deweyClass != ''
	)outraw_titles
	ON distinct_bib = out_bib
)inner_join_table
ORDER BY deweyClass;