My buddy bingbingzi has been begging me to learn CodeQL, so I'll take a quick look at it as a favor to him.

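Installing CodeQL

Installing the engine

Download the CodeQL engine (CLI) for your operating system from https://github.com/github/codeql-cli-binaries/releases and add it to your PATH environment variable. Run codeql in a terminal (cmd) to check that the setup works.

Installing the SDK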

git clone https://github.com/Semmle/ql

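VS Code extension

Install the CodeQL extension in VS Code, then set the path to the codeql executable in its settings.

Using CodeQL

We use the vulnerable test project by l4yn3: https://github.com/l4yn3/micro_service_seclab/

To check that the environment we just set up works, build a database from this project with the following command: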

codeql database create ~/CodeQL/databases/micro-service-seclab-database --language="java" --command="mvn clean install --file pom.xml" --source-root="$HOME/CodeQL/micro-service-seclab/"

The Maven build can fail at this point. Adding the following to pom.xml, which skips the tests, works around it:

			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-surefire-plugin</artifactId>
				<version>2.22.2</version>
				<configuration>
					<skipTests>true</skipTests>
				</configuration>
			</plugin>

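Let's go through the command above. codeql database create creates a database; --language sets the language to java; --command is the command used to build the project (interpreted languages such as Python and JavaScript do not need one); --source-root is the path to the project source.

Once the command succeeds, import the database into VS Code; the database is the directory given after create.

Then open the downloaded SDK directory in VS Code, create a test.ql under java/ql, and enter select "hello world" to test it. Some versions report an error here. In that case, first create a file named qlpack.yml in the same directory as the ql file, with the following content: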

name: mssql
version: 0.0.1
libraryPathDependencies: codeql/java-all

Then right-click and choose CodeQL: Run Query to run the ql file.

CodeQL syntax

The basic structure of a QL query is:

from [datatype] var
where condition(var = something)
select var

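Commonly used QL library classes:

Name	Description
Method	The method class. Method method matches every method in the project
MethodAccess	The method call class. MethodAccess call matches every method call in the project
Parameter	The parameter class. Parameter matches every parameter in the project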
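
The table lists Parameter but the examples that follow only use Method, so here is a small sketch of a Parameter query. It assumes the standard Java library is available through import java; TypeString and getCallable() come from that library, and the choice to filter on String parameters is just for illustration.

```ql
import java

// List every parameter of type java.lang.String, together with the
// method or constructor that declares it.
from Parameter p
where p.getType() instanceof TypeString
select p, p.getCallable()
```

String parameters are a reasonable first thing to look at, since they are the ones most likely to end up concatenated into a query.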

Method

Get all methods defined in the project:

import java

from Method method
select method

Get the methods named getStudent:

import java

from Method method
where method.hasName("getStudent")
select method.getName(),method.getDeclaringType()

method.getName() returns the name of the method.

method.getDeclaringType() returns the type (class) that declares the method.
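
MethodAccess from the table works the same way. As a rough sketch, the following query lists the call sites of getStudent rather than its definitions. It uses the same class names as the rest of this post; as far as I know, newer releases of the library rename MethodAccess to MethodCall, so adjust the name if your version complains.

```ql
import java

// Find every call to a method named getStudent, and show the method
// or constructor in which the call appears.
from MethodAccess call, Method callee
where
  callee = call.getMethod() and
  callee.hasName("getStudent")
select call, call.getEnclosingCallable(), callee.getDeclaringType()
```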

Predicates

A predicate is much like a function. The example above can be rewritten as:

import java
predicate isStudent(Method method){
    exists(|method.hasName("getStudent"))
}

from Method method
where isStudent(method)
select method.getName(),method.getDeclaringType()

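The predicate keyword declares a predicate with no return value. exists is a subquery: it holds or does not hold depending on the formula inside it, and that result is used to filter the data.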
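
For contrast, a predicate can also return a value: the keyword predicate is replaced by a result type, and the body binds the special variable result. A minimal sketch, where declaringTypeName is a made-up helper name and not something from the library:

```ql
import java

/** Gets the name of the type that declares `m` (hypothetical helper). */
string declaringTypeName(Method m) { result = m.getDeclaringType().getName() }

from Method m
where m.hasName("getStudent")
select m, declaringTypeName(m)
```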

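Source and Sink

What are source and sink

In the theory of automated code security auditing, the central concept is a triple: source, sink and sanitizer.

A source is the input point of a taint chain. For example, the code that reads HTTP request parameters is an obvious source.

A sink is the execution point of a taint chain. For SQL injection, the function that finally executes the SQL statement is the sink (it might be called query, exeSql, or something else).

A sanitizer, also called a cleansing function, is a method that breaks the propagation chain somewhere along the taint path.

A vulnerability exists only when a source and a sink both exist and the path from source to sink is open.

Put simply, a source is a variable we control and a sink is a dangerous function. This is the most basic rule of code auditing: controllable variable + dangerous function.

Setting the Source

In CodeQL, the source is set with the following code: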

override predicate isSource(DataFlow::Node src) {}

Let's look at what the sources are in our test project, for example:

@RequestMapping(value = "/one")
public List<Student> one(@RequestParam(value = "username") String username) {
    return indexLogic.getStudent(username);
}

The project uses the Spring Boot framework. As anyone familiar with it will know, the /one route takes a username parameter; here, username is the source.

@PostMapping(value = "/object")
public List<Student> objectParam(@RequestBody Student user) {
    return indexLogic.getStudent(user.getUsername());
}

Here the source is user, regardless of its type. In this example, the code that sets the source is:

override predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

This is a rule that ships with the SDK and covers most commonly used source entry points. Spring Boot, which we use, is included, so we can use it directly.
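
To see which nodes RemoteFlowSource actually covers in a given database, it can be queried on its own. This is a small debugging sketch, not part of the final rule; getSourceType() is the abstract member of RemoteFlowSource in the library version used in this post.

```ql
import java
import semmle.code.java.dataflow.FlowSources

// List every node the library treats as remote user input,
// with the library's description of what kind of source it is.
from RemoteFlowSource src
select src, src.getSourceType()
```

If the two controller parameters above (username and user) show up in the results, the source side of the rule is working.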

Setting the Sink

In this example, the sink should be a call (MethodAccess) to the query method (Method), so we set the sink as:

override predicate isSink(DataFlow::Node sink) {
exists(Method method, MethodAccess call |
  method.hasName("query")
  and
  call.getMethod() = method and
  sink.asExpr() = call.getArgument(0)
)
}

The query above means: find a call site of a query() method and treat its first argument as the sink.
In the test project (micro-service-seclab), the sink is:

jdbcTemplate.query(sql, ROW_MAPPER);
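
Matching on the method name alone also picks up any unrelated method that happens to be called query. One possible refinement, sketched here as a standalone query, is to also require the declaring type. It assumes the call resolves to a method declared on Spring's org.springframework.jdbc.core.JdbcTemplate, which is what the test project appears to use.

```ql
import java

// Calls to JdbcTemplate.query(...), with the first argument
// (the SQL string) shown next to each call.
from MethodAccess call, Method method
where
  call.getMethod() = method and
  method.hasName("query") and
  method
      .getDeclaringType()
      .hasQualifiedName("org.springframework.jdbc.core", "JdbcTemplate")
select call, call.getArgument(0)
```

The same extra condition could be added inside isSink; the trade-off is that a narrower sink misses other database APIs.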

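Data flow

With source and sink set, both ends are in place, but whether a vulnerability exists depends on whether the two ends are connected.

If a tainted variable can flow unobstructed into a dangerous function, a vulnerability exists.

We use config.hasFlowPath(source, sink) to check whether they are connected.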

from VulConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select source.getNode(), source, sink, "source"

First result

We define source and sink with the official TaintTracking::Configuration class. Whether the path between them is open is left to config.hasFlowPath(source, sink), which CodeQL provides.

The first version of our demo.ql is:

/**
 * @id java/examples/vuldemo
 * @name Sql-Injection
 * @description Sql-Injection
 * @kind path-problem
 * @problem.severity warning
 */

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.security.QueryInjection
import DataFlow::PathGraph


class VulConfig extends TaintTracking::Configuration {
  VulConfig() { this = "SqlInjectionConfig" }

  override predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

  override predicate isSink(DataFlow::Node sink) {
    exists(Method method, MethodAccess call |
      method.hasName("query")
      and
      call.getMethod() = method and
      sink.asExpr() = call.getArgument(0)
    )
  }
}


from VulConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select source.getNode(), source, sink, "source"

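CodeQL syntax resembles Java: extends means inheriting from the parent class TaintTracking::Configuration.

This class is the general-purpose class the official library provides for data flow analysis. It offers many data-flow-related predicates, such as isSource (defines the source) and isSink (defines the sink).

src instanceof RemoteFlowSource means that src must be of type RemoteFlowSource. RemoteFlowSource provides a very thorough set of source definitions, and the Spring Boot sources we use here are already covered.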
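
The select clause in the first version reports the source node with a fixed message. A common convention for path-problem queries is to report the sink node instead and to link back to the source through a $@ placeholder in the message. The following is a sketch of that variant under the same configuration API used in this post; the @id, the message text and the explicit TaintTracking import are my own additions.

```ql
/**
 * @id java/examples/vuldemo-message
 * @name Sql-Injection (message variant)
 * @description Sql-Injection
 * @kind path-problem
 * @problem.severity warning
 */

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PathGraph

class VulConfig extends TaintTracking::Configuration {
  VulConfig() { this = "SqlInjectionConfig" }

  override predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

  override predicate isSink(DataFlow::Node sink) {
    exists(MethodAccess call |
      call.getMethod().hasName("query") and
      sink.asExpr() = call.getArgument(0)
    )
  }
}

from VulConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
// Alert on the sink; "$@" is replaced by a link to source.getNode().
select sink.getNode(), source, sink, "SQL string built from $@.", source.getNode(),
  "user-controlled input"
```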

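Fixing false positives

The first version above produces false positives: in one reported result the parameter is a List<Long>, whose elements cannot carry an injection payload.

This shows that our rule reports false positives for List<Long>, and for similar lists of numbers, because the source mistakenly covers parameters of these types. isSanitizer can be used to fix this.

isSanitizer is the sanitizing predicate provided by the CodeQL class TaintTracking::Configuration. Its prototype is: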

override predicate isSanitizer(DataFlow::Node node) {}

CodeQL's built-in default rule checks whether the current node has a basic type:

override predicate isSanitizer(DataFlow::Node node) {
node.getType() instanceof PrimitiveType or
node.getType() instanceof BoxedType or
node.getType() instanceof NumberType
}

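This means that if the current node has one of the basic types above, the taint chain is sanitized and cut off, and no vulnerability is reported.

So we only need to add a little special handling on top of that: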

override predicate isSanitizer(DataFlow::Node node) {
    node.getType() instanceof PrimitiveType or
    node.getType() instanceof BoxedType or
    node.getType() instanceof NumberType or
    exists(ParameterizedType pt| node.getType() = pt and pt.getTypeArgument(0) instanceof NumberType )
  }

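If the type of the current node is a primitive type, a number type, or a generic type parameterized with a number type (such as List<Long>), the data flow is cut off and is not tracked any further.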
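
Before relying on the extra condition, it can help to check what it matches. This sketch runs the same ParameterizedType test as a standalone query over parameters; it only looks at the first type argument, exactly like the sanitizer above, so a type such as Map<String, Long> would not match.

```ql
import java

// Parameters whose type is a generic type with a numeric first
// type argument, e.g. List<Long>.
from Parameter p, ParameterizedType pt
where
  p.getType() = pt and
  pt.getTypeArgument(0) instanceof NumberType
select p, pt
```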

Fixing false negatives

Consider the following code:

public List<Student> getStudentWithOptional(Optional<String> username) {
    String sqlWithOptional = "select * from students where username like '%" + username.get() + "%'";
    //String sql = "select * from students where username like ?";
    return jdbcTemplate.query(sqlWithOptional, ROW_MAPPER);
}

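Here the value is obtained through username.get(), and this is probably where the source-to-sink link breaks. We just need to connect it by force.

isAdditionalTaintStep is a predicate provided by the CodeQL class TaintTracking::Configuration. Its prototype is: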

override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) {}

Its purpose is to force taint from a controllable node A to another node B, so that node B becomes controllable as well.

/**
 * @id java/examples/vuldemo
 * @name Sql-Injection
 * @description Sql-Injection
 * @kind path-problem
 * @problem.severity warning
 */

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.security.QueryInjection
import DataFlow::PathGraph

predicate isTaintedString(Expr expSrc, Expr expDest) {
    exists(Method method, MethodAccess call, MethodAccess call1 |
        expSrc = call1.getArgument(0) and
        expDest = call and
        call.getMethod() = method and
        method.hasName("get") and
        method.getDeclaringType().toString() = "Optional<String>" and
        call1.getArgument(0).getType().toString() = "Optional<String>"
    )
}

class VulConfig extends TaintTracking::Configuration {
  VulConfig() { this = "SqlInjectionConfig" }

  override predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

  override predicate isSanitizer(DataFlow::Node node) {
    node.getType() instanceof PrimitiveType or
    node.getType() instanceof BoxedType or
    node.getType() instanceof NumberType or
    exists(ParameterizedType pt| node.getType() = pt and pt.getTypeArgument(0) instanceof NumberType )
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(Method method, MethodAccess call |
      method.hasName("query")
      and
      call.getMethod() = method and
      sink.asExpr() = call.getArgument(0)
    )
  }
  override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) {
    isTaintedString(node1.asExpr(), node2.asExpr())
  }
}


from VulConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select source.getNode(), source, sink, "source"
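
The isTaintedString predicate links any Optional<String> argument to any Optional<String>.get() call, whether or not the two are related. A narrower alternative, offered only as a sketch and not as the original author's approach, is to step from the qualifier of a get() call to the call itself. optionalGetStep is a made-up name, and the check assumes the method is declared on java.util.Optional.

```ql
import java

/** Holds if `dest` is a call `src.get()` on a java.util.Optional (hypothetical helper). */
predicate optionalGetStep(Expr src, Expr dest) {
  exists(MethodAccess call |
    call.getMethod().hasName("get") and
    call.getMethod()
        .getDeclaringType()
        .getSourceDeclaration()
        .hasQualifiedName("java.util", "Optional") and
    src = call.getQualifier() and
    dest = call
  )
}

// Inspect the steps on their own before wiring them into a configuration.
from Expr src, Expr dest
where optionalGetStep(src, dest)
select src, dest
```

To use it, call optionalGetStep(node1.asExpr(), node2.asExpr()) from isAdditionalTaintStep in place of isTaintedString.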

instanceof

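instanceof is very useful syntactic sugar for tidying up code structure.

As we know, source and sink can be defined with an exists(|) subquery. But a source or sink can get very complex (to keep a rule general, we may need to support sources from Spring Boot, Thrift RPC, Servlet and so on). If all of that goes into one subquery, as condition 1 or condition 2 or condition 3 and so on, we may not be able to read it later, let alone maintain it.
Moreover, in some cases a single subquery cannot do the job at all.

instanceof gives us a mechanism for this: we only need to define an abstract class, such as the one used in this example: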

/** A data flow source of remote user input. */
abstract class RemoteFlowSource extends DataFlow::Node {
  /** Gets a string that describes the type of this remote flow source. */
  abstract string getSourceType();
}

Then use instanceof in the isSource predicate to check that src is of type RemoteFlowSource.

override predicate isSource(DataFlow::Node src) {
    src instanceof RemoteFlowSource
}
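
The same pattern works for sinks. In this sketch, SqlSink and JdbcQuerySink are made-up names: the abstract class is the extension point, and each concrete subclass contributes one kind of sink, so new sinks can be added without touching isSink.

```ql
import java
import semmle.code.java.dataflow.DataFlow

/** A node that ends up in a SQL statement (hypothetical extension point). */
abstract class SqlSink extends DataFlow::Node { }

/** The first argument of any call to a method named `query`. */
class JdbcQuerySink extends SqlSink {
  JdbcQuerySink() {
    exists(MethodAccess call |
      call.getMethod().hasName("query") and
      this.asExpr() = call.getArgument(0)
    )
  }
}

// Every member of every concrete subclass is a SqlSink.
from SqlSink s
select s
```

With this in place, isSink reduces to sink instanceof SqlSink, mirroring the isSource line above.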

The Lombok problem

Lombok generates getter and setter methods automatically. Typical code looks like this:

package com.l4yn3.microserviceseclab.data;
import lombok.Data;

@Data
public class Student {
    private int id;
    private String username;
    private int sex;
    private int age;
}

However, because of how Lombok is implemented, CodeQL cannot see the code Lombok generates, so vulnerabilities in code that uses Lombok go undetected.

Fortunately, someone posted a workaround in an issue on the official CodeQL repository (see https://github.com/github/codeql/issues/4984):

# get a copy of lombok.jar
wget https://projectlombok.org/downloads/lombok.jar -O "lombok.jar"
# run "delombok" on the source files and write the generated files to a folder named "delombok"
java -jar "lombok.jar" delombok -n --onlyChanged . -d "delombok"
# remove "generated by" comments
find "delombok" -name '*.java' -exec sed '/Generated by delombok/d' -i '{}' ';'
# remove any left-over import statements
find "delombok" -name '*.java' -exec sed '/import lombok/d' -i '{}' ';'
# copy delombok'd files over the original ones
cp -r "delombok/." "./"
# remove the "delombok" folder
rm -rf "delombok"

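The script above removes the Lombok annotations from the code and restores the Java source of the setter and getter methods, so that CodeQL's data flow can proceed and find the vulnerability.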
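
To find out which classes are affected before running delombok, a query along these lines may help. It is only a sketch: it assumes the @Data annotation is still visible in the database built from the original (not delomboked) sources, and it checks only lombok.Data, not @Getter or @Setter.

```ql
import java

// Classes annotated with Lombok's @Data, whose getters and setters
// are generated and therefore not visible as ordinary source methods.
from Class c, Annotation a
where
  a = c.getAnAnnotation() and
  a.getType().hasQualifiedName("lombok", "Data")
select c
```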

References

https://www.freebuf.com/articles/web/283795.html

ch1e's blog

Learn Codeql With L4yn3