JSON (JavaScript Object Notation) is a data format that is designed for human readability and writeability.
Below is an example JSON file for college football teams from the 2022 season. I only included two conferences and three teams for space, but this helps illustrate some key advantages:
{
"conferences": [
{
"abbreviation": "ACC",
"id": 1,
"name": "Atlantic Coast Conference",
"founded": 1953,
"size": 14,
"hasDivisions": true,
"teams": [
{
"id": 1,
"name": "University of Virginia",
"abbreviation": "UVA",
"mascot": "Cavaliers",
"division": "ACC Coastal",
"record": {
"totalWins": 3,
"totalLosses": 7,
"confWins": 1,
"confLosses": 6,
"winPercent": 0.3
}
},
{
"id": 2,
"name": "Virginia Tech",
"abbreviation": "VT",
"mascot": "Hokies",
"division": "ACC Coastal",
"record": {
"totalWins": 3,
"totalLosses": 8,
"confWins": 1,
"confLosses": 6,
"winPercent": 0.272727272
}
}
]
},
{
"abbreviation": "Big XII",
"id": 2,
"name": "Big 12 Conference",
"founded": 1994,
"size": 14,
"hasDivisions": false,
"teams": [
{
"id": 3,
"name": "West Virginia University",
"abbreviation": "WVU",
"mascot": "Mountaineers",
"record": {
"totalWins": 5,
"totalLosses": 7,
"confWins": 3,
"confLosses": 6,
"winPercent": 0.416666667
}
}
]
}
]
}
The above structure allows for JSON to be:
While we aren’t discussing XML right now, be aware the XML and JSON (and YAML) all support these features, and are often used in the same way to transmit data. JSON is, to the best of my reckoning, the most popular format right now (as of 2023). XML was the standard for much of the 2000s. But while the syntax of each file is a little different, the actual representation of the data is basically the same: organizing data into a nested structure of lists/maps.
JSON supports the following datatypes:
1) integers - whole numbers - 5
2) floats - decimal numbers - 3.5
3) strings - text bounded by "quotation marks"
4) booleans - true
or false
5) lists - ordered data bounded by square brackets. For example, [8, 6, 7, 5, 3, 0, 9]
6) maps - key-value pairs bounded by curly braces where every key must be a string, but the value can be any datatype; For example: {"name": "Will McBurney", age:35}
You can, and almost always will, have things like a map having a list or map value, or your data being stored as a list of maps, etc.
Specifically, the above json example is structured as follows:
The outermost structure is a map with one key: “conferences”. This is a map because of the use of curly braces: { }
. The key “conferences maps to a list (denoted by square braces, [ ]
) of conference object.
Each conference object is itself a map with the keys: 1) “abbreviation”: The conference’s abbreviation (string) 2) “id”: an ID number unique to each conference (integer) 3) “name”: the full name of the conference (string) 4) “founded”: the year the conference was founded (integer) 5) “size”: The current size of the conference (for football members, in this case) (integer) 6) “hasDivisions”: A boolean value denoting whether the conference has explicit divisions (boolean) 7) “teams”: A list of team objects in the conference
Example of a conference object:
{
"abbreviation": "ACC",
"id": 1,
"name": "Atlantic Coast Conference",
"founded": 1953,
"size": 14,
"hasDivisions": true,
"teams": []
}
Each team object is a map which has the following keys: 1) “id” - a unique id for each team 2) “name” - the name of the team’s institution. 3) “abbreviation” - the common abbreviation of that team 4) “mascot” - the team’s nickname 5) “division” - the division that team is in (note that this may or may not be present depending on the conferences “hasDivisions” value) 6) “record” - an object (represented by map) that states that team’s win/loss record
Example of a team object
{
"id": 2,
"name": "Virgina Tech",
"abbreviation": "VT",
"mascot": "Hokies",
"division": "ACC Coastal",
"record": {
"totalWins": 3,
"totalLosses": 8,
"confWins": 1,
"confLosses": 6,
"winPercent": 0.272727272
}
}
Let’s say I knew I wanted to get the University of Virginia’s conference wins and losses from the above data. For the sake of this example, we’ll say I know ahead of time that UVA is in the ACC. Here is how I would step through the JSON data (note that we will show how to do this with Java in the next unit, for now we’re sticking to a language agnostic approach).
The code snippets shown are for a general “python like” language, and aren’t meant to be viewed as literal code, but as psuedocode
1) Starting with the overall JSON Object, get the list from the key conferences
: conferenceList = myJsonObject["conferences"]
2) From conferences, step through each conference one at a time until you find the conference that has ACC
for abbreviation, and get that index (in this case, index 0
): accConference = conferenceList[0]
3) Get the team list from the conference: accTeams = accConference["teams"]
4) Iterate through the list, getting the team whose name is “University of Virginia” (in this case, index 0
): uvaTeam = accTeams[0]
5) Get the record object with the key “record”: uvaRecord = uvaTeam["record"]
6) Get the total wins from the record object with key “totalWins”: totalWins = uvaRecord["totalWins"]
7) Get the total losses in the same way: totalLosses = uvaRecord["totalLosses"]
Regardless what language you use, parsing JSON is done in this fashion, from the outside-in. Keep this in mind as we show Java code for parsing JSON in the next unit.