Outreachy Blog Post #5: CodeQL: To Hunt A Bug.

Photo by Kony on Unsplash


GitHub's Trophy Bug Hunter.


You know how cool snipers in the military are; they are the true hunters. Oh, how I love seeing them in action. Modern assassins, if you know what I am talking about.

CodeQL is a modern bug assassin: GitHub's own pretty Assassin Queen. I'm talking ANNA POLIATOVA level.

CodeQL is a code analysis engine developed by GitHub for identifying bugs: performance issues, security issues, code quality issues, etc. It can be incorporated into a CI/CD pipeline by enabling it in GitHub Actions as part of your codebase's workflow. It can also be used locally in VSCode to analyze source code. The same engine is also available as a command-line tool, the CodeQL CLI.

It makes use of things called "queries", which are written in a query language called "QL", hence the name "CodeQL".

CodeQL supports the analysis of the following languages:

  • C/C++

  • C#

  • Go

  • Kotlin

  • JavaScript

  • Java

  • Python

  • Ruby

  • Swift

  • TypeScript

To analyse source code locally using VSCode or the CodeQL CLI, the source code first has to be made into a CodeQL database; without this, querying the source code is not possible.
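
For the command-line route, here is a minimal Python sketch of that two-step dance: build the database first, then analyze it. It assumes the `codeql` binary is on your PATH; the database name `my-db`, the output file `results.sarif` and the choice of `python` as the language are placeholders of mine, not anything from a real project. By default it only prints the commands, so you can check them before running anything.

```python
import shlex
import subprocess
from typing import List, Optional


def create_db_cmd(db_dir: str, language: str, source_root: str) -> List[str]:
    """Build the `codeql database create` command line."""
    return [
        "codeql", "database", "create", db_dir,
        f"--language={language}",        # e.g. "python"; compiled languages may also need a build step
        f"--source-root={source_root}",  # folder holding the code to analyze
    ]


def analyze_db_cmd(db_dir: str, output: str, queries: Optional[str] = None) -> List[str]:
    """Build the `codeql database analyze` command line (results in SARIF format)."""
    cmd = [
        "codeql", "database", "analyze", db_dir,
        "--format=sarif-latest",
        f"--output={output}",
    ]
    if queries:
        # Optional query, suite or pack to run instead of the CLI's default choice.
        cmd.append(queries)
    return cmd


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder names: "my-db" and "results.sarif" are assumptions for illustration.
    run(create_db_cmd("my-db", "python", "."))
    run(analyze_db_cmd("my-db", "results.sarif"))
```

Pass `dry_run=False` to `run` once the printed commands look right for your setup.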

Code scanning: when enabling CodeQL in your GitHub repo, you can use either the default setup or the advanced setup. Both have a configuration option for choosing the suite (group) of queries that CodeQL will always run against your codebase: the Default query suite or the Security-Extended query suite. The Default suite has fewer false positives and higher precision. The Security-Extended suite contains everything in the Default suite plus extra queries of slightly lower precision and severity, so it can surface more issues at the cost of more false positives. This is vital information, so take note.
Cf: https://docs.github.com/en/code-security/code-scanning/managing-your-code-scanning-configuration/codeql-query-suites
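
One rough way to see what a bigger suite adds is to compare the results of two runs. The sketch below assumes each run was saved as a SARIF file and loaded into a Python dict (for example with `json.load`); it relies only on the standard SARIF layout of `runs`, `results` and `ruleId`. The two sample runs and their rule IDs are made up for illustration and are not real CodeQL output.

```python
from collections import Counter
from typing import Set


def alerts_per_rule(sarif: dict) -> Counter:
    """Count how many alerts each rule produced in a SARIF document."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<unknown>")] += 1
    return counts


def only_in_extended(default_sarif: dict, extended_sarif: dict) -> Set[str]:
    """Rule IDs that fired in the extended run but not in the default run."""
    return set(alerts_per_rule(extended_sarif)) - set(alerts_per_rule(default_sarif))


# Made-up sample data standing in for two analysis runs of the same codebase.
default_run = {"runs": [{"results": [{"ruleId": "example/rule-a"}]}]}
extended_run = {"runs": [{"results": [
    {"ruleId": "example/rule-a"},
    {"ruleId": "example/rule-b"},
    {"ruleId": "example/rule-b"},
]}]}

print(alerts_per_rule(extended_run))
print(sorted(only_in_extended(default_run, extended_run)))  # ['example/rule-b']
```

The rules that show up only in the extended run are the ones worth a closer manual look, since that is where the extra false positives tend to hide.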

Having danced with this beauty a couple of times, I figured it is quite easy to use: direct, simple and effective, save for some false positives, which can be spotted as long as one understands the details of the presumed issue and is quite adept in the specific language being analyzed. Also, it can be heavy if your spec is not its type ;-) (your PC spec). I mean, a low-memory system will make analysis less efficient, and the analysis could even be halted if memory is too low.

I have been successful in identifying a couple of bugs in the Suricata codebase using CodeQL. I have fixed some of these issues, while others are still being fixed, especially the ones identified after my PR adding the Security-Extended query suite to the CodeQL configuration was merged (https://github.com/OISF/suricata/pull/10259).

OK!!! There it is, in black and white.
This sage will be back with more magic spells to share.
Until then, query your codebase ;-).
Peace.