From the CORGIS Dataset Project
By Austin Cory Bart acbart@vt.edu
Version 2.0.0, created 11/1/2015
Tags: publishers, amazon, books, sales, genres, literature, english
This dataset comes from a newspaper article analyzing Amazon e-book sales by genre and publisher. Unfortunately, it does not include each book’s title or author. The collection covers 54,000 titles spanning several genres and types of publishing companies: practically every book on every Amazon bestseller list. Along with publisher information, it includes each book’s overall sales rank in the Amazon Kindle store, and the books are sorted by this rank. Keep in mind that this data is NOT time-oriented; it is a snapshot of many different books, not a record of one book over time.
http://authorearnings.com/report/september-2015-author-earnings-report/
Each row represents a single book.
Index | Type | Example Value |
---|---|---|
0 | dict | { } |
1 | dict | (same structure) |
2 | dict | (same structure) |
... | ... | ... |
Key | Type | Example Value | Description |
---|---|---|---|
"genre" | str | "genre fiction" | |
"sold by" | str | "HarperCollins Publishers" | |
"daily average" | dict | { } | |
"publisher" | dict | { } | |
"statistics" | dict | { } | |
Keys of the "daily average" dictionary:
Key | Type | Example Value | Description |
---|---|---|---|
"amazon revenue" | float | 6832.0 | |
"author revenue" | float | 6832.0 | |
"gross sales" | float | 34160.0 | |
"publisher revenue" | float | 20496.0 | |
"units sold" | int | 7000 | |
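The example values listed for the "daily average" fields are internally consistent: the three revenue figures add up to gross sales, and gross sales divided by units sold gives 4.88, the same figure shown as the "sale price" example. A minimal sketch of that arithmetic follows, using only the example values from this page. The variable names are illustrative, and whether these relationships hold for every record in the dataset is an assumption, not something the source states.

```python
# Example values copied from the "daily average" fields documented above.
daily_average = {
    "amazon revenue": 6832.0,
    "author revenue": 6832.0,
    "gross sales": 34160.0,
    "publisher revenue": 20496.0,
    "units sold": 7000,
}

# Sum the three revenue shares (Amazon, author, publisher).
revenue_total = (
    daily_average["amazon revenue"]
    + daily_average["author revenue"]
    + daily_average["publisher revenue"]
)

# Compare against gross sales with a small tolerance for float rounding.
shares_match = abs(revenue_total - daily_average["gross sales"]) < 0.01

# Average revenue per unit sold, rounded to cents.
price_per_unit = round(
    daily_average["gross sales"] / daily_average["units sold"], 2
)

print(revenue_total, shares_match, price_per_unit)
```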
Keys of the "publisher" dictionary:
Key | Type | Example Value | Description |
---|---|---|---|
"name" | str | "Katherine Tegen Books" | |
"type" | str | "big five" | |
Keys of the "statistics" dictionary:
Key | Type | Example Value | Description |
---|---|---|---|
"average rating" | float | 4.57 | |
"sale price" | float | 4.88 | |
"sales rank" | int | 1 | |
"total reviews" | int | 9604 | |
Download all of the following files.
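# Import the publishers module that accompanies this dataset
import publishers
# Load the book records described above
book = publishers.get_book()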
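Because each record nests three dictionaries, it can be convenient to flatten a record into a single level before tabulating or exporting it. The sketch below is one way to do that, not part of the `publishers` module: `flatten` is a hypothetical helper, and `sample_book` is assembled by hand from the example values on this page, which may not all describe the same book. In practice the records would come from `publishers.get_book()`.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one dict with sep-joined keys (hypothetical helper)."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries such as "publisher".
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hand-built stand-in for one record, using the documented example values.
sample_book = {
    "genre": "genre fiction",
    "sold by": "HarperCollins Publishers",
    "daily average": {
        "amazon revenue": 6832.0,
        "author revenue": 6832.0,
        "gross sales": 34160.0,
        "publisher revenue": 20496.0,
        "units sold": 7000,
    },
    "publisher": {"name": "Katherine Tegen Books", "type": "big five"},
    "statistics": {
        "average rating": 4.57,
        "sale price": 4.88,
        "sales rank": 1,
        "total reviews": 9604,
    },
}

flat_book = flatten(sample_book)
for key, value in flat_book.items():
    print(f"{key}: {value}")
```

Nested fields can also be read directly, for example `sample_book["publisher"]["type"]`; flattening mainly helps when every field should become its own column.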