
Publishers Python Library

From the CORGIS Dataset Project

By Austin Cory Bart acbart@vt.edu
Version 2.0.0, created 11/1/2015
Tags: publishers, amazon, books, sales, genres, literature, english

Overview

This data comes from a newspaper article analyzing Amazon e-book sales by genre and publisher. Unfortunately, it does not include each book’s title or author. The collection covers 54,000 titles across several genres and types of publishing companies, practically every book on every Amazon bestseller list. Along with publisher information, each record includes the book’s overall Amazon Kindle store sales rank, which is used to sort the books. Keep in mind that this data is NOT time-oriented: it is a collection of many different books, not one book tracked over time.

http://authorearnings.com/report/september-2015-author-earnings-report/

Explore Structure

Each row represents a single book.

The dataset is a list of dictionaries:

Index   Type   Example Value
0       dict   { }
1       dict   (same structure)
2       dict   (same structure)
...     ...    ...

Each dictionary has the following keys:

Key               Type   Example Value
"genre"           str    "genre fiction"
"sold by"         str    "HarperCollins Publishers"
"daily average"   dict   { }
"publisher"       dict   { }
"statistics"      dict   { }

Keys of "daily average":

Key                   Type    Example Value
"amazon revenue"      float   6832.0
"author revenue"      float   6832.0
"gross sales"         float   34160.0
"publisher revenue"   float   20496.0
"units sold"          int     7000

Keys of "publisher":

Key      Type   Example Value
"name"   str    "Katherine Tegen Books"
"type"   str    "big five"

Keys of "statistics":

Key                Type    Example Value
"average rating"   float   4.57
"sale price"       float   4.88
"sales rank"       int     1
"total reviews"    int     9604

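As a rough illustration of the nested structure described above, the sketch below builds one record by hand from the example values listed in this section and reads a few fields from it. The name `sample_book` is made up for this example, and it is an assumption that the example values all belong to the same book. In real use the record would be one element of the list returned by `publishers.get_book()`.

```python
# One record assembled by hand from the example values in this section.
# Assumption: these values describe a single book; "sample_book" is a
# made-up name, not part of the library.
sample_book = {
    "genre": "genre fiction",
    "sold by": "HarperCollins Publishers",
    "daily average": {
        "amazon revenue": 6832.0,
        "author revenue": 6832.0,
        "gross sales": 34160.0,
        "publisher revenue": 20496.0,
        "units sold": 7000,
    },
    "publisher": {"name": "Katherine Tegen Books", "type": "big five"},
    "statistics": {
        "average rating": 4.57,
        "sale price": 4.88,
        "sales rank": 1,
        "total reviews": 9604,
    },
}

# Nested values are reached with two lookups: outer key, then inner key.
publisher_name = sample_book["publisher"]["name"]
daily_units = sample_book["daily average"]["units sold"]
sales_rank = sample_book["statistics"]["sales rank"]

# For these example values, the three revenue shares add up to gross sales.
daily = sample_book["daily average"]
revenue_total = (
    daily["amazon revenue"] + daily["author revenue"] + daily["publisher revenue"]
)

print(publisher_name, daily_units, sales_rank, revenue_total)
```

The revenue sum matches "gross sales" for the example values shown here; whether that holds for every record is not stated in this documentation.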
Downloads

Download all of the following files.

  1. publishers.py
  2. publishers.data

Usage

import publishers
# get_book() returns the whole dataset: a list with one dictionary per book
books = publishers.get_book()
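Once the list is loaded, a typical next step is to aggregate over it. The sketch below totals daily units sold per publisher type. The helper `units_by_publisher_type` and the records in `demo_books` are made up for illustration (only "big five" appears as a type in this documentation); with the real data you would pass the list returned by `publishers.get_book()` instead.

```python
from collections import defaultdict


def units_by_publisher_type(books):
    """Sum "units sold" (daily average) for each publisher type.

    Hypothetical helper, not part of the publishers library.
    """
    totals = defaultdict(int)
    for book in books:
        publisher_type = book["publisher"]["type"]
        totals[publisher_type] += book["daily average"]["units sold"]
    return dict(totals)


# Made-up records that contain only the keys the helper reads.
# "hypothetical type" is a placeholder label, not a value from the dataset.
demo_books = [
    {"publisher": {"type": "big five"}, "daily average": {"units sold": 7000}},
    {"publisher": {"type": "big five"}, "daily average": {"units sold": 100}},
    {"publisher": {"type": "hypothetical type"}, "daily average": {"units sold": 50}},
]

totals = units_by_publisher_type(demo_books)
print(totals)
```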

Documentation

get_book()
Returns a list of dictionaries, each representing one book.
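Because the result is a plain list of dictionaries, ordinary list operations apply to it. The sketch below picks the best-ranked books by the "sales rank" statistic. The helper `top_ranked` and the records in `demo_books` are made up for illustration, and it assumes that a lower rank number means a better-selling book.

```python
def top_ranked(books, n=3):
    """Return the n books with the lowest "sales rank" value.

    Hypothetical helper, not part of the publishers library.
    Assumption: rank 1 is the best-selling book.
    """
    return sorted(books, key=lambda book: book["statistics"]["sales rank"])[:n]


# Made-up records with only the keys used here; the placeholder genres
# are not values taken from the dataset.
demo_books = [
    {"genre": "placeholder genre A", "statistics": {"sales rank": 3}},
    {"genre": "genre fiction", "statistics": {"sales rank": 1}},
    {"genre": "placeholder genre B", "statistics": {"sales rank": 2}},
]

top_two = top_ranked(demo_books, n=2)
ranks = [book["statistics"]["sales rank"] for book in top_two]
print(ranks)
```

With the real data, the call would be `top_ranked(publishers.get_book(), n=10)`. Sorting explicitly avoids relying on the order in which the list is delivered.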