A simple tutorial of JQ

Show how JQ could be used to query a JSON with a lot of possibilities. For these examples I am using the output derived from a openQA API query.

openqa-cli api --host http://openqa.suse.de /jobs limit=5000 result="failed" latest=1 >output.failed.json

To know more about API query, please visit Cris’ tutorial

JQ is huge, so I am only explaining some useful examples derived from some tickets when I was working in openQA and tools team at SUSE. To know more about JQ please visit https://stedolan.github.io/jq/

Selection

Show as beauty JSON cat output.failed.json | jq ''

Output:

{
  "jobs": [
    {
      "group": "Maintenance: SLE 12 SP3 Updates",
      "group_id": 108,
      "id": 5895632,
      ...
      "settings":{
        "MACHINE": "64bits"
        ...
      },
      ...
      "t_finished": "2021-05-06T13:20:27",
      "t_started": "2021-05-06T13:12:41"
    },
    ...

Get all the ids as list cat output.failed.json | jq '.jobs[].id'

Output:

5895632
5895663
...

Note: This list is ready for be processed in a bash loop

Get all the ids as valid json array cat output.failed.json | jq '[.jobs[].id]'

Output:

[
  5895632,
  5895663,
  5895680,
  ...

Get the last id cat output.failed.json | jq '[.jobs[].id] | last'

Get the last job cat output.failed.json | jq '.jobs | last'

Get the first job cat output.failed.json | jq '.jobs | first'

Get the count of jobs cat output.failed.json | jq '.jobs | length'

Advanced selection

Get the id, name and the finish time for all the jobs

cat output.failed.json | jq '[.jobs[] | {id: .id, name: .name, finished: .t_finished}]'

Output:

[
  {
    "id": 5895632,
    "name": "sle-12-SP3-Server-DVD-HA-Updates-x86_64-Build20210426-1-qam_ha_rolling_update_node01@64bit",
    "finished": "2021-04-26T04:37:31"
  },

  ...

Get the id, name and the finish time in epoch seconds

cat output.failed.json | jq '[.jobs[] | {id: .id, name: .name, finish: (.t_finished+"Z" | fromdate)}]'

Output:

[
  {
    "id": 5895632,
    "name": "sle-12-SP3-Server-DVD-HA-Updates-x86_64-Build20210426-1-qam_ha_rolling_update_node01@64bit",
    "finish": 1619411851
  },
  ...

Include to the jobs the elapsed time (finished - started)

cat output.failed.json | jq '.jobs | first | .elapsed=((.t_finished+"Z"|fromdate) - (.t_started+"Z"|fromdate))'

Output:

  ...
    "elapsed": 3203
  }
]

Queries

Get all the jobs where the name contains SP3. And S2

cat output.failed.json | jq '[.jobs[] | select(.name | contains("SP3"))] | length'
cat output.failed.json | jq '[.jobs[] | select(.name | contains("SP2"))] | length'

Get all the jobs where the name contains SP2 or SP3

cat output.failed.json | jq '[.jobs[] | select((.name | contains("SP3")) or (.name | contains("SP2")))] | length'

Get all the jobs where the machine is 64bits Machine is a attribute inside the settings structure

cat output.failed.json | jq '[.jobs[] | select( .settings.MACHINE == "64bit" )] | length'

Same case than before (Get all the jobs where the machine is 64bits) using key/value structures*

cat output.failed.json | jq '[.jobs[] | select( .settings | to_entries[] | (.key =="MACHINE") and (.value == "64bit"))] | length'

Get all the jobs than have the issue 19151 as part of any settings that contains ISSUE

cat output.failed.json | jq '[.jobs[] | select( .settings | to_entries[] | (.key | contains("ISSUE") ) and (.value | contains("19151") ))] | length'

Sort by

Sort the elements by machine name

cat output.failed.json | jq '[.jobs[] | {id:.id, name: .name, machine: .settings.MACHINE }] | sort_by(.machine)'

Get the job that spend less time to finish

cat output.failed.json | jq '[.jobs[] | .elapsed=((.t_finished+"Z"|fromdate) - (.t_started+"Z"|fromdate))] | sort_by(.elapsed) | first'

group by

Group all the jobs by machine

cat output.failed.json | jq '[.jobs[] | {id:.id, name: .name, machine: .settings.MACHINE }] | group_by(.machine)'

Output:

[
  [
    {
      "id": 5895632,
      "name": "sle-12-SP3-Server-DVD-HA-Updates-x86_64-Build20210426-1-qam_ha_rolling_update_node01@64bit",
      "machine": "64bit"
    },
    ...
  ],
  [...],
  [...],
  ...

Count the number of jobs for every machine setting

cat output.failed.json | jq '[.jobs[] | {id:.id, name: .name, machine: .settings.MACHINE }] | group_by(.machine) | ([ .[] | [{machine: (first | .machine), count: length}] | first ])'
howto  tutorial  jq  bash