Chapter 6 Data Sources
6.1 Internal Sources
The following data sources have been consolidated into our internal database, ojodb.
The data is collected via a web scraper maintained by Asemio and hosted on the Google Cloud Platform. It is scraped from various web and PDF sources, detailed below. For further information on the data collection specification, refer to the documentation by Asemio. For more information on maintenance procedures, see here.
6.1.1 OSCN: Oklahoma District Court Records
OSCN-sourced data can be found in the "public" schema of the database. For example, you can list all tables in the schema with ojo_list_tables(schema = "public"). The Oklahoma State Courts Network (OSCN) holds information on all types of criminal and civil cases filed in District Courts across Oklahoma. The format and content of court records available on OSCN differ according to the records management system used by the county.
We divide counties into two categories: OSCN counties and ODCR counties. For 13 counties, including the 6 largest by population, the information available is extensive and structured consistently. The relative ease of using data collected from OSCN allows us to perform more reliable and granular analysis. We often refer to these 13 counties as OSCN counties.
We refer to the other 64 counties as ODCR counties. The records for these counties contain less detail, so getting data on bonds, dispositions, and other important aspects of a case takes more guesswork and gives more uncertain results. Note also that two ODCR counties have more than one court: Creek, which has courts in Bristow, Drumright, and Sapulpa, and Okmulgee, which has courts in Okmulgee and Henryetta.
district | source |
---|---|
ADAIR | OSCN |
ALFALFA | ODCR |
ATOKA | ODCR |
BEAVER | ODCR |
BECKHAM | ODCR |
BLAINE | ODCR |
BRISTOW | ODCR |
BRYAN | ODCR |
CADDO | ODCR |
CANADIAN | OSCN |
CARTER | ODCR |
CHEROKEE | ODCR |
CHOCTAW | ODCR |
CIMARRON | ODCR |
CLEVELAND | OSCN |
COAL | ODCR |
COMANCHE | OSCN |
COTTON | ODCR |
CRAIG | ODCR |
CREEK | ODCR |
CUSTER | ODCR |
DELAWARE | ODCR |
DEWEY | ODCR |
DRUMWRIGHT | ODCR |
ELLIS | OSCN |
GARFIELD | OSCN |
GARVIN | ODCR |
GRADY | ODCR |
GRANT | ODCR |
GREER | ODCR |
HARMON | ODCR |
HARPER | ODCR |
HASKELL | ODCR |
HENRYETTA | ODCR |
HUGHES | ODCR |
JACKSON | ODCR |
JEFFERSON | ODCR |
JOHNSTON | ODCR |
KAY | ODCR |
KINGFISHER | ODCR |
KIOWA | ODCR |
LATIMER | ODCR |
LEFLORE | ODCR |
LINCOLN | ODCR |
LOGAN | OSCN |
LOVE | ODCR |
MAJOR | ODCR |
MARSHALL | ODCR |
MAYES | ODCR |
MCCLAIN | ODCR |
MCCURTAIN | ODCR |
MCINTOSH | ODCR |
MURRAY | ODCR |
MUSKOGEE | ODCR |
NOBLE | ODCR |
NOWATA | ODCR |
OKFUSKEE | ODCR |
OKLAHOMA | OSCN |
OKMULGEE | ODCR |
OSAGE | ODCR |
OTTAWA | ODCR |
PAWNEE | ODCR |
PAYNE | OSCN |
PITTSBURG | ODCR |
PONTOTOC | ODCR |
POTTAWATOMIE | ODCR |
PUSHMATAHA | OSCN |
ROGERMILLS | OSCN |
ROGERS | OSCN |
SEMINOLE | ODCR |
SEQUOYAH | ODCR |
STEPHENS | ODCR |
TEXAS | ODCR |
TILLMAN | ODCR |
TULSA | OSCN |
WAGONER | ODCR |
WASHINGTON | ODCR |
WASHITA | ODCR |
WOODS | ODCR |
WOODWARD | ODCR |
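The split in the table above can be encoded as a small lookup, which is useful when deciding how much weight to give fields such as bonds or dispositions for a given district. The following is a minimal sketch in base R. The helper name court_source() is hypothetical and not part of any OJO package; the district spellings are copied from the table.

```r
# The 13 districts marked OSCN in the table above; all others are ODCR.
oscn_districts <- c(
  "ADAIR", "CANADIAN", "CLEVELAND", "COMANCHE", "ELLIS", "GARFIELD",
  "LOGAN", "OKLAHOMA", "PAYNE", "PUSHMATAHA", "ROGERMILLS", "ROGERS", "TULSA"
)

# Hypothetical helper: label each district name with its records source.
# Matching is case-insensitive; anything not in the OSCN list is labelled ODCR.
court_source <- function(district) {
  ifelse(toupper(district) %in% oscn_districts, "OSCN", "ODCR")
}

court_source(c("Tulsa", "CREEK", "BRISTOW"))
```

Note that this sketch labels any unrecognized or misspelled name as ODCR, so validate district names against the table first if that matters for your analysis.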
6.1.1.1 Uses
OSCN data is the most common resource for OJO projects about the justice system. There is an abundance of information on criminal and civil cases, including parties, criminal charges and civil case issues, case resolutions (called dispositions), court appearances, fines and fees, etc. Some questions we might answer using OSCN records include:
- What are the most common charges in criminal cases?
- How much in fines and fees are levied against people in criminal cases? How much is collected?
- What percentage of criminal felony cases are dismissed?
- How many evictions are filed and granted each year?
There are millions of cases stored in our database, so the possibilities for research are vast, but it takes considerable effort and subject matter expertise to ensure that we are drawing valid conclusions. The [Research Methodology] section contains details on how we do that.
6.1.1.2 Tables
6.1.1.2.1 case
The case table contains basic information about the case as well as the IDs of data associated with the case that is stored in other tables.
Table | Variable | Description |
---|---|---|
case | id | Case ID number |
case | title | Title of the case (e.g., State of Oklahoma v. Roman Roy) |
case | district | Name of district court |
case | case_type | Abbreviation of case type (e.g., “CF” for felony) |
case | year | Year of case filing |
case | case_number | Case number assigned by court, consisting of case type abbreviation, year of filing, and number in filing sequence (e.g. CF-2021-1234) |
case | date_filed | Date of case filing |
case | date_closed | Date of case close (not always updated at the end of a case) |
case | status | Current status of case |
case | judge | Judge assigned to case. In many cases, this is not an individual’s name but the docket the case appears on. |
case | appealed_from | This field contains no data as of 2021-11-30 |
case | attorneys | Nested list of attorney IDs associated with parties to the case |
case | parties | Nested list of party IDs associated with the case |
case | events | Nested list of event IDs associated with the case |
case | citation_information | Nested list of citation IDs associated with the case |
case | minutes | Nested list of minute IDs associated with the case |
case | counts | For criminal cases, nested list of count IDs associated with the case (OSCN counties) |
case | issues | For civil cases, nested list of issue IDs associated with the case |
case | created_at | Date and time that case information was first collected |
case | updated_at | Date and time that case information was last collected |
case | open_counts | Nested list of charge descriptions associated with the case (ODCR counties only) |
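Because case_number combines the case type, filing year, and filing sequence, it can be split back into those parts when the separate columns are unavailable. The following base R sketch assumes the TYPE-YEAR-SEQUENCE pattern described in the table (e.g. CF-2021-1234); parse_case_number() is a hypothetical helper, and real case numbers may include variants that this pattern does not cover.

```r
# Hypothetical helper: split case numbers such as "CF-2021-1234" into parts.
# Values that do not match the assumed pattern yield NA in every parsed column.
parse_case_number <- function(case_number) {
  pattern <- "^([A-Za-z]+)-([0-9]{4})-([0-9]+)$"
  ok <- grepl(pattern, case_number)

  # Extract one capture group, leaving NA where the pattern does not match.
  grab <- function(group) {
    out <- rep(NA_character_, length(case_number))
    out[ok] <- sub(pattern, paste0("\\", group), case_number[ok])
    out
  }

  data.frame(
    case_number = case_number,
    case_type   = grab(1),
    year        = as.integer(grab(2)),
    sequence    = as.integer(grab(3)),
    stringsAsFactors = FALSE
  )
}

parse_case_number(c("CF-2021-1234", "not a case number"))
```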
6.1.1.2.2 case_type
The case_type table is a lookup table that matches case_type abbreviations (as found in the case table) to a corresponding text label, e.g. "CF" = "Criminal Felony".
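As an illustration of how such a lookup might be applied, the sketch below joins toy stand-ins for the two tables in base R. Only the "CF" label comes from the text above; the "CM" row and the column names abbreviation and label are assumptions, so check the real case_type table before relying on them.

```r
# Toy stand-in for the case table (not real records).
cases <- data.frame(
  id = c(1, 2, 3),
  case_type = c("CF", "CM", "CF"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the case_type lookup table.
# Column names and the "CM" row are assumptions for illustration.
case_types <- data.frame(
  abbreviation = c("CF", "CM"),
  label = c("Criminal Felony", "Criminal Misdemeanor"),
  stringsAsFactors = FALSE
)

# Left join: every case is kept, and unknown abbreviations get an NA label.
labelled <- merge(
  cases, case_types,
  by.x = "case_type", by.y = "abbreviation",
  all.x = TRUE
)
labelled
```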
6.1.1.2.3 party
The party table contains the name and role of persons or organizations involved in a case, e.g. Tulsa Public Schools, Defendant. There are normally multiple rows per case.
If the party of interest is an individual, there may be an associated ID in the person_record column. Use this ID to link party records to information in the person_record table.
6.1.1.2.4 count
The count table contains details on the reasons for a criminal case, i.e. the count(s) brought against a defendant. See the issue table to obtain the reason(s) for a civil case, i.e. issues. There can be multiple rows per defendant per case.
Not all counts remain the same throughout the criminal case. Some may be dropped or modified by the time the case is disposed. Compare the count_as_filed column to the count_as_disposed column. Use the disposition and disposition_date columns to determine the outcome of a count, and when it was disposed.
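The comparison between filed and disposed counts can be sketched as follows. The data frame is a toy example, not real records: the case_id column, the charge descriptions, and the disposition values are all assumptions made for illustration.

```r
# Toy stand-in for the count table (invented values).
counts <- data.frame(
  case_id = c(1, 1, 2),
  count_as_filed = c("BURGLARY", "LARCENY", "DUI"),
  count_as_disposed = c("BURGLARY", NA, "RECKLESS DRIVING"),
  disposition = c("CONVICTION", "DISMISSED", "CONVICTION"),
  stringsAsFactors = FALSE
)

# Flag counts whose description changed between filing and disposition.
# A missing count_as_disposed is treated as "not changed" here, which is
# an assumption; it may instead mean the count has not been disposed yet.
counts$changed <- !is.na(counts$count_as_disposed) &
  counts$count_as_disposed != counts$count_as_filed

counts[counts$changed, c("case_id", "count_as_filed", "count_as_disposed")]
```

In practice the descriptions are free text, so an exact string comparison like this one will likely also flag changes that are only differences in spelling or formatting.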
6.1.1.2.5 issue
The issue table contains details on the reasons for a civil case. Unlike in the count table, issues are contained in only the description column and do not change throughout the case. Use the disposition and disposition_date columns to determine the outcome of an issue, and when it was disposed.
6.1.1.2.6 minute
Each case has a number of minutes associated with it. The minute table stores the code, description, and other information for each record. If the minute has an associated fine or fee, you can use the amount column to determine the cost.
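As a rough illustration of working with the amount column, the sketch below totals amounts per case in base R. The data is invented, and the case_id column name and the code values are assumptions rather than the real schema.

```r
# Toy stand-in for the minute table (invented values).
minutes <- data.frame(
  case_id = c(1, 1, 2, 2),
  code = c("TEXT", "FEE1", "FEE1", "FEE2"),
  amount = c(NA, 100, 50.5, 25),
  stringsAsFactors = FALSE
)

# Sum amounts per case. With the formula interface, rows with a missing
# amount are dropped before summing (the default na.action is na.omit).
fees_by_case <- aggregate(amount ~ case_id, data = minutes, FUN = sum)
fees_by_case
```

A case whose minutes all have a missing amount would not appear in the result at all, so join back to the full list of cases if you need explicit zeros.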
6.1.2 ODOC: State Prison Records
ODOC-sourced data can be found in the "odoc" schema of the database. The Oklahoma Department of Corrections records data on each person who enters its system, including information on their offenses and sentence.
6.1.2.1 Uses
Some questions we might answer using ODOC records include:
- How many people are being held in prisons around the state?
- How many people in the prison system were sentenced for violent crimes?
- What is the average sentence length for violent vs. non-violent sentences?
Linking individuals from ODOC records to those sourced from OSCN, we might ask:
- Of the people released from DOC custody last year, how many have since been charged with a criminal count in Oklahoma?
6.1.2.2 Tables
6.1.3 OCDC: Oklahoma County Jail Records
The Oklahoma County Detention Center provides us access to their internal data via JailTracker, a jail management platform.
6.1.3.1 Uses
6.1.3.2 Tables
6.1.4 IIC: Tulsa County Jail Records
The source data can be viewed here.
6.1.4.1 Uses
6.1.4.2 Tables
6.2 External Sources
6.2.1 Census
In most cases, census data is best obtained using the {tidycensus} package. You will need to obtain an API key from the Census Bureau website to utilize the package. See the {tidycensus} documentation for complete instructions.
6.2.1.1 Population Estimates
When obtaining county-level population estimates that can be compared year-to-year, DO NOT use the Decennial or ACS census releases. Rather, DO use the Population Estimates provided by tidycensus::get_estimates(product = "population").
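A call for Oklahoma counties might look like the sketch below. It assumes a recent version of {tidycensus} in which get_estimates() accepts a vintage argument, and it pins vintage 2019 as an assumption, since the supported product values can differ by vintage; check the {tidycensus} documentation for the year you need. The wrapper name ok_county_population() is hypothetical, and the API call is only attempted when the package is installed and a key is set in the CENSUS_API_KEY environment variable.

```r
# Hypothetical wrapper around tidycensus::get_estimates() for Oklahoma counties.
ok_county_population <- function(vintage = 2019) {
  tidycensus::get_estimates(
    geography = "county",
    product = "population",
    state = "OK",
    vintage = vintage
  )
}

# Only call the Census API when the package and an API key are available.
if (requireNamespace("tidycensus", quietly = TRUE) &&
    nzchar(Sys.getenv("CENSUS_API_KEY"))) {
  ok_pop <- ok_county_population()
  print(head(ok_pop))
}
```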
6.2.1.2 Decennial
6.2.1.3 ACS
6.2.2 Geography
In the {tidycensus} package, you can specify geometry = TRUE to return shapefiles used for geographic plots. In {ggplot2}, use geom_sf() to create such maps.