String Datasets

  • com.giphy:tag

    Popular hashtags on Giphy

  • com.imgflip:meme_text

    Text of memes

  • com.instagram:caption

    Best captions on Instagram

  • com.instagram:hashtag

    Popular hashtags on Instagram

  • com.linkedin:industry

    Industry field from LinkedIn

  • com.phdcomics:title

    PhD Comics titles

  • com.reddit.frontpage:category


  • com.spotify:playlist

    Popular Spotify playlists

  • com.twitter:hashtag

    Popular hashtags from Twitter

  • com.twitter:username

    Twitter usernames

  • com.xkcd:alt_text

    XKCD comic alt text (hover text)

  • com.xkcd:title

    XKCD comic titles

  • com.xkcd:what_if_title

    XKCD's What If blog titles


    Titles of YouTube channels


    Titles of YouTube videos

  • gov.nasa:apod_title

    Titles of NASA's Astronomy Picture of the Day (historical archive)

  • tt:celestial_body

    Planets, moons and stars

  • tt:email_subject

    Email subjects

  • tt:job_title

    Job titles and professions

  • tt:location

    Cities, points of interest and addresses

  • tt:long_free_text

    General text (paragraph)

  • tt:message

    Chat/SMS messages

  • tt:movie_title

    Movie titles

  • tt:news_description

    Snippets from national news articles

  • tt:news_title

    Titles of national news articles

  • tt:path_name

    File and directory names

  • tt:person_first_name

    First names of people (from the US census and social security data)

  • tt:person_full_name

    Full names of people (from the US census and social security data)

  • tt:search_query

    Web search queries

  • tt:short_free_text

    General text (short)

  • tt:song_album

    Names of popular music albums (weighted by popularity)

  • tt:song_artist

    Popular music artists

  • tt:song_name

    Names of songs (weighted by popularity)

  • tt:word

    Dictionary words
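
Every identifier in the list above has the shape prefix:name, where the prefix is either a reverse-domain string such as com.xkcd or the short prefix tt. As a minimal sketch of how a client might split and group these identifiers, assuming only that shape: the helper names split_dataset_id and group_by_namespace are hypothetical and are not part of any documented API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_dataset_id(dataset_id: str) -> Tuple[str, str]:
    """Split 'prefix:name' into (prefix, name).

    Hypothetical helper: it assumes only the 'prefix:name' shape
    visible in the list, nothing about how datasets are loaded.
    """
    prefix, sep, name = dataset_id.partition(":")
    if not sep or not prefix or not name:
        raise ValueError(f"not a 'prefix:name' identifier: {dataset_id!r}")
    return prefix, name


def group_by_namespace(dataset_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group dataset names under their prefix, preserving input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for dataset_id in dataset_ids:
        prefix, name = split_dataset_id(dataset_id)
        groups[prefix].append(name)
    return dict(groups)


if __name__ == "__main__":
    # A few identifiers taken from the list above.
    sample = ["tt:word", "com.xkcd:title", "tt:location", "com.xkcd:alt_text"]
    for prefix, names in group_by_namespace(sample).items():
        print(prefix, names)
```

Splitting on the first colon keeps dots inside the prefix intact, so com.reddit.frontpage:category is read as the prefix com.reddit.frontpage and the name category.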