Project Breakdown: Brute

I just released my new project, Brute. I wrote something like this before, but it was so bad that I had to rewrite it. Hence, the birth of Brute.

If you don't already know what Brute does, then you should definitely go ahead and check out its GitHub repo, which I linked above.

Predecessor

Now you might be wondering what was wrong with its predecessor, BruteExpose. Well, everything. Here's a list:

  1. I used IPinfo's MMDB files instead of their API, which resulted in errors whenever an IP address didn't exist in the database. That's because these database files have to be updated manually or with a script, so they fall out of date. This time, I decided to use their API instead, and as expected, it solved that issue.
  2. JSON was used as a database instead of an actual database like PostgreSQL or MongoDB. This was a big mistake. I wasn't having writing issues but reading issues: I learned that once a JSON file reaches a certain size, reads start failing. I guess the threshold varies depending on your server's specs, but still, I would try to read something from the file only for it to throw me some weird error saying it couldn't be retrieved.
  3. An ObjectMapper was used to structure the data inside the JSON file. If you ever got BruteExpose to work, you would have noticed that the JSON file was highly organized. It stored each password and its usage, usernames, hourly, daily, and weekly metrics, as well as countries and username/password combos. All of this went through an ObjectMapper, which polluted the code and made it hard to maintain and to figure out what went where and what did what.
  4. BruteExpose would listen to a .txt file for SSH attempts. Whenever someone tried to log in through OpenSSH, their credentials would be dumped into the .txt file, and because the application was listening, it would log and analyze the credentials and then store them inside the .json file. This was AWFUL. I first had to get the contents of the log, then parse the individual log entry, and then parse the entire log. It was inefficient and frankly inconvenient, because I had to ensure the file existed and occasionally empty the .txt log file. After all, you can only store so much in a .txt file before it has reading issues. Look at this monstrosity: BruteFileListener.java.

Was there anything good about BruteExpose? Yeah, it used a WebSocket. 🙂 😃

Successor

I decided to write the successor, Brute, in Rust instead of Java because Rust is more reliable, memory safe, and faster, and of course there's the borrow checker.

What does Brute do differently compared to BruteExpose?

  1. Rust instead of Java. Unlike Java, Rust gives me reliability and memory safety, which is primarily why I chose it.
  2. Switched to PostgreSQL. I didn't want to make the same mistake again and use JSON as a database, so I switched to a relational database. I also wanted to be able to store data from multiple "farmer" servers that collect the attempts, which is another reason for the switch.
  3. HTTP server. Instead of just listening to a .txt file for changes, the collector immediately calls the /brute/attack/add endpoint, which is protected by a bearer token, so don't worry. As soon as the endpoint gets hit, Brute processes the data and immediately sends it to any clients connected to the WebSocket.
  4. IPinfo API. Instead of downloading the database files and running into IPs that IPinfo's files don't contain, we just call their endpoint and live worry-free.
  5. Actors. Because a JSON database is no longer used, I don't need to structure my code around it, so I decided to go with actors. All I do is assign a handler to a specific task, and BAM, it's done. A minimal sketch of the pattern follows this list.
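
Here's what that pattern looks like in actix, reduced to a toy example (a hedged sketch, not Brute's code: the Ping message, MyActor, and the string reply are made up for illustration):

use actix::prelude::*;

// Hypothetical message type; in Brute, messages like Individual play this role.
#[derive(Message)]
#[rtype(result = "String")]
struct Ping;

struct MyActor;

impl Actor for MyActor {
    type Context = Context<Self>;
}

// Assigning a handler to a specific task: this actor responds to Ping.
impl Handler<Ping> for MyActor {
    type Result = String;

    fn handle(&mut self, _msg: Ping, _ctx: &mut Self::Context) -> Self::Result {
        "pong".to_string()
    }
}

#[actix::main]
async fn main() {
    let addr = MyActor.start();
    // send() delivers the message and awaits the handler's result.
    let res = addr.send(Ping).await.unwrap();
    println!("{res}"); // prints "pong"
}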

Crates

While building the project, I initially used Axum as the web framework but decided to switch to Actix-web because of this issue. I was expecting the switch to resolve my problem, but in fact it did NOT. I'm still not entirely sure what I did to fix it; the commit that supposedly fixed it isn't actually the one that did, because I forgot to reference the correct commit in the issue. I believe the culprit was the first actor I implemented to handle the POST-request mechanism.

# main crates
actix = "0.13.5" 
actix-web = { version = "4", features = ["rustls-0_23"] }
sqlx = { version = "0.8.0", features = [ "runtime-tokio", "tls-rustls", "postgres", "derive"] }
actix-web-actors = "4.3.0"
serde = { version = "1.0.130", features = ["derive"] }
serde_json = "1.0.122"
ipinfo = "3.0.0"

HTTP

This time, I decided to use an HTTP server with POST/GET requests to process everything. I can support any protocol, because all a collector has to do is call the /brute/attack/add endpoint with the appropriate payload, and it will work!
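
For example, a collector could report an attempt like this (a hedged sketch: reqwest and tokio are not part of Brute's crate list, the host and token are placeholders, and only the payload field names are taken from the handler below):

use serde_json::json;

// Hypothetical collector-side call; reqwest (with its "json" feature) and tokio are assumed.
#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::new();
    let response = client
        .post("https://example.com/brute/attack/add") // placeholder host
        .bearer_auth("my-secret-token")               // must match the server's configured token
        .json(&json!({
            "username": "root",
            "password": "hunter2",
            "ip_address": "203.0.113.7", // a documentation-range IP
            "protocol": "ssh"
        }))
        .send()
        .await?;
    println!("status: {}", response.status());
    Ok(())
}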

I'm only going to discuss the /brute/attack/add endpoint, which is indeed a POST request. This is what happens when it is called.

#[post("/attack/add")]
async fn post_brute_attack_add(
    state: web::Data<AppState>,             // generic parameters were lost in formatting; AppState (holding the bearer token and actor address) is an assumed name
    payload: web::Json<IndividualPayload>,  // likewise an assumed name for the payload type
    bearer: BearerAuth,
) -> Result<HttpResponse, BruteResponeError> {
    if !bearer.token().eq(&state.bearer) {
        return Ok(HttpResponse::Unauthorized().body("body"));
    }

    if payload.ip_address.eq("127.0.0.1") {
        return Err(BruteResponeError::ValidationError("empty ip or local ip".to_string()));
    } 

    let mut individual = Individual::new_short(
        payload.username.clone(),
        payload.password.clone(),
        payload.ip_address.clone(),
        payload.protocol.clone(),
    );

    individual.validate()?;
    
    match state.actor.send(individual).await {
        Ok(res) => {
            websocket::BruteServer::broadcast(websocket::ParseType::ProcessedIndividual, res.unwrap());
            Ok(HttpResponse::Ok().into())
        },
        Err(er) => Err(BruteResponeError::InternalError(er.to_string())),
    }
}

What is happening?

  1. Because the endpoint is exposed to the public, I needed to add a bearer token, so the first part of the code checks whether the supplied bearer token is correct.
  2. The next part of the code, the 127.0.0.1 check, is no longer strictly necessary, because .validate() now checks the validity of the IP. All it does is reject local addresses.
  3. The validate() function does a range of things. It checks that the strings are non-empty and within their specified lengths, and it checks that the IP falls within the ranges IPinfo can actually process; if any check fails, it throws an error. A sketch of what this might look like follows this list.
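
Something along these lines (a hedged sketch of the idea, not Brute's actual validate(); the length limits, the field names, and the routability test are assumptions):

use std::net::IpAddr;

impl Individual {
    pub fn validate(&self) -> Result<(), BruteResponeError> {
        // Non-empty, bounded-length strings (the limits here are made up).
        if self.username.is_empty() || self.username.len() > 255 {
            return Err(BruteResponeError::ValidationError("invalid username".to_string()));
        }
        if self.password.is_empty() || self.password.len() > 255 {
            return Err(BruteResponeError::ValidationError("invalid password".to_string()));
        }

        // The IP must parse and be publicly routable so IPinfo can process it.
        let ip: IpAddr = self
            .ip_address
            .parse()
            .map_err(|_| BruteResponeError::ValidationError("invalid ip".to_string()))?;
        let routable = match ip {
            IpAddr::V4(v4) => !(v4.is_private() || v4.is_loopback() || v4.is_link_local()),
            IpAddr::V6(v6) => !v6.is_loopback(),
        };
        if !routable {
            return Err(BruteResponeError::ValidationError("non-routable ip".to_string()));
        }
        Ok(())
    }
}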

These checks ensure that the data provided can be processed. Finally, we process the data with state.actor.send(individual), which sends it to the designated handler. After the result is successfully processed, it gets broadcast to all connected clients via the WebSocket.

Actor

If you look at system.rs, we have an actor named BruteSystem, which implements various handlers, such as:

  1. Handler<Individual> (POST)
  2. Handler<RequestWithLimit<ProcessedIndividual>> (GET)
  3. Handler<RequestWithLimit<TopUsername>> (GET)
  4. and many, many more.

Here's the code for Handler<Individual> 

impl Handler<Individual> for BruteSystem {
    // reconstructed: the generic parameters here were lost in formatting
    type Result = ResponseActFuture<Self, Result<ProcessedIndividual, BruteResponeError>>;

    fn handle(&mut self, msg: Individual, _: &mut Self::Context) -> Self::Result {
        let reporter = self.reporter();
        let fut = async move {
            match reporter.start_report(msg).await {
                Ok(result) => {
                    info!(
                        "Successfully processed Individual with ID: {}. Details: Username: '{}', IP: '{}', Protocol: '{}', Timestamp: {}, Location: {} - {}, {}, {}",
                        result.id(),
                        result.username(),
                        result.ip(),
                        result.protocol(),
                        result.timestamp(),
                        result.city().as_ref().unwrap_or(&"{EMPTY}".to_string()),
                        result.region().as_ref().unwrap_or(&"{EMPTY}".to_string()),
                        result.country().as_ref().unwrap_or(&"{EMPTY}".to_string()),
                        result.postal().as_ref().unwrap_or(&"{EMPTY}".to_string())
                    );
                    Ok(result)
                }
                Err(e) => {
                    error!("Failed to process report: {}", e);
                    Err(BruteResponeError::InternalError(
                        "something definitely broke on our side".to_string(),
                    ))
                }
            }
        };
        fut.into_actor(self).map(|res, _, _| res).boxed_local()
    }
}
..

The reporter.start_report() call starts a transaction for many database queries. That's all it really does; if you want to know a little more about it, look at the code here.
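
In rough shape it probably looks like this (a hedged sketch reconstructed from the description, not the actual function; which models get reported, and ProcessedIndividual implementing Reportable, are assumptions):

impl BruteReporter<BruteSystem> {
    pub async fn start_report(&self, individual: Individual) -> anyhow::Result<ProcessedIndividual> {
        // In the real code this drives many queries within one transaction.
        // Each model implements Reportable and persists itself (see the next section).
        let inserted = Individual::report(self, &individual).await?;
        let processed = ProcessedIndividual::report(self, &inserted).await?;
        // ... further Reportable models (metrics, top usernames, etc.) would follow here.
        Ok(processed)
    }
}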

Reporter

Models should implement Reportable<T: Reporter, R> in order to inherit .report(reporter: T, model: R), so it can be called within the start_report() function.

Here's the implementation for the Reportable trait.

#[allow(async_fn_in_trait)]
pub trait Reportable<T: Reporter, R> {
    async fn report<'a>(reporter: &T, model: &'a R) -> anyhow::Result<Self>
    where
        Self: Sized;
}

Here's an example of it in use.


impl Reportable<BruteReporter<BruteSystem>, Individual> for Individual {
    async fn report<'a>(
        reporter: &BruteReporter<BruteSystem>,
        model: &'a Individual,
    ) -> anyhow::Result<Self> {
        let pool = &reporter.brute.db_pool;
        let query = r#"
            INSERT INTO individual (id, username, password, ip, protocol, timestamp)
            VALUES ($1, $2, $3, $4, $5, $6)
            RETURNING *
        "#;

        // Generate a new ID and timestamp for the new instance
        let new_id = Uuid::new_v4().as_simple().to_string();
        let new_timestamp = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_millis() as i64;

        // Execute the query and get the inserted row back
        let inserted = sqlx::query_as::<_, Individual>(query)
            .bind(&new_id)
            .bind(&model.username())
            .bind(&model.password())
            .bind(&model.ip())
            .bind(&model.protocol())
            .bind(new_timestamp)
            .fetch_one(pool)
            .await?;

        Ok(inserted)
    }
}

I think the code is pretty self-explanatory. All it does is create an entry in the 'individual' table with the specified values.

IPinfo

This time around, I decided to use their API and not their database files, though I seriously considered building a standalone API around the database files so I could call it locally. I would have checked their checksums every 5-10 minutes and, if they changed, downloaded the new database file.

Anyway, because I use their API and IP data can change, I added a recycling system. All it does is check the desired IP: if the last entry is older than 5 minutes, it fetches fresh data from the API; if not, it reuses the cached data. This drastically decreases my API usage and in return saves a ton of requests. Simple but effective.
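
The recycling system boils down to a small TTL cache; here's a hedged sketch of the idea (IpCache, IpData, and the fetch closure are stand-ins, not Brute's actual types):

use std::collections::HashMap;
use std::time::{Duration, Instant};

// Stand-in for whatever subset of the IPinfo response gets stored.
#[derive(Clone)]
struct IpData {
    country: String,
}

struct IpCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, IpData)>,
}

impl IpCache {
    fn new() -> Self {
        Self { ttl: Duration::from_secs(5 * 60), entries: HashMap::new() }
    }

    // Reuse a cached entry if it's younger than the TTL; otherwise refetch.
    fn get_or_fetch(&mut self, ip: &str, fetch: impl FnOnce() -> IpData) -> IpData {
        if let Some((seen, data)) = self.entries.get(ip) {
            if seen.elapsed() < self.ttl {
                return data.clone();
            }
        }
        let fresh = fetch(); // in Brute this would be the IPinfo API call
        self.entries.insert(ip.to_string(), (Instant::now(), fresh.clone()));
        fresh
    }
}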

Conclusion

The successor is 100000% better than the predecessor, as it should be. The project has been running inside a Docker container for about three days now and has had no errors! So, I can safely say this project was well thought out.

Goodbye.  ☺️☺️☺️