Project Breakdown: Brute

I just released my new project, Brute. I wrote something like this before, but it was so bad that I had to rewrite it.  Hence, the birth of Brute.

If you don't already know what Brute does, then you should definitely go ahead and check out its GitHub repo that I linked above.

Predecessor

Now you might be wondering what was wrong with its predecessor, BruteExpose. Well, everything. Here's a list:

  1. I used IPinfo's MMDB instead of their API, which resulted in errors when an IP address did not exist in the database. These database files have to be updated manually or automated with a script. So, this time, I decided to use their API instead, and as expected, it solved that issue.
  2. JSON was used as a database instead of an actual database like PostgreSQL or MongoDB. This was a big mistake. I wasn't having writing issues but reading issues: once a JSON file reaches a certain size, reads start to fail. I guess the threshold varies depending on your server's specs, but still, I would try to read something from the file only for it to throw me some weird error saying it couldn't be retrieved.
  3. An ObjectMapper was used to structure the data inside the JSON file. If you had ever gotten BruteExpose to work, you would have noticed that the JSON file was highly organized: it stored the password and its usage, the username, hourly, daily, and weekly metrics, as well as the country and username/password combos. This had all been done through an ObjectMapper, which polluted the code and made it hard to maintain and to figure out what went where and what did what.
  4. BruteExpose would listen to a .txt file for SSH attempts. Whenever someone tried to log in through OpenSSH, it would dump their credentials into a .txt file, and because the application was listening, it would log and analyze the credentials and then store them inside the .json file. This was AWFUL. I first had to get the contents of the log, then parse the individual log entry, and then parse the entire log. It was inefficient and frankly inconvenient, because I had to ensure the file existed and occasionally empty the .txt log file. After all, you can only store so much in a .txt file before it has reading issues. Look at this monstrosity: BruteFileListener.java.

Was there anything good about BruteExpose? Yeah, it used a WebSocket. 🙂 😃

Successor

I decided to write the successor, Brute, in Rust instead of Java because it's more reliable, memory safe, and faster, and of course there's the borrow checker.

What does Brute do differently compared to BruteExpose?

  1. Rust instead of Java. I chose Rust primarily for reliability and memory safety.
  2. Switched to PostgreSQL. I didn't want to make the same mistake again and use JSON as a database, so I switched to a relational database. I also wanted to be able to store data from multiple "farmer" servers that collected the attempts, so that's another reason why.
  3. HTTP server. Instead of just listening to a .txt file for changes, the collector immediately calls the /brute/attack/add endpoint, which is, in fact, protected by a bearer token, so don't worry. As soon as it gets hit, it processes the data and immediately sends it to any clients on the WebSocket.
  4. IPinfo API. Instead of downloading the database files directly, and to avoid running into IPs that IPinfo does not contain, we just call their endpoint and live worry-free.
  5. Actors. Because a JSON database is no longer used, I don't need to structure my code around it. So, I decided to go with actors. All I do is assign a handler for a specific task, and BAM, it's done.
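If you've never used actors before, the idea can be sketched with plain threads and channels. This is not Brute's actual actix code (all names here are hypothetical); it's just a minimal illustration of the pattern: the actor owns its state, and everyone else talks to it by sending messages.

```rust
use std::sync::mpsc;
use std::thread;

// A message the actor knows how to handle. This is a hypothetical type,
// loosely mirroring the shape of Brute's login-attempt payload.
struct Individual {
    username: String,
    ip: String,
    // channel to send the processed result back to the caller
    reply: mpsc::Sender<String>,
}

// Spawn the "actor": a thread that owns its state and processes
// messages one at a time, so no locks are needed.
fn spawn_actor() -> mpsc::Sender<Individual> {
    let (tx, rx) = mpsc::channel::<Individual>();
    thread::spawn(move || {
        let mut processed = 0usize; // actor-private state
        for msg in rx {
            processed += 1;
            let summary = format!("#{} {}@{}", processed, msg.username, msg.ip);
            let _ = msg.reply.send(summary);
        }
    });
    tx
}

fn main() {
    let actor = spawn_actor();
    let (reply_tx, reply_rx) = mpsc::channel();
    actor
        .send(Individual {
            username: "root".into(),
            ip: "203.0.113.7".into(),
            reply: reply_tx,
        })
        .unwrap();
    println!("{}", reply_rx.recv().unwrap());
}
```

Actix adds a lot on top of this (async handlers, typed results, supervision), but the appeal is the same: assign a handler for a specific task and send it messages.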

Crates

While building the project, I initially used Axum as the web framework but decided to switch to Actix Web because of this issue. I expected the switch to resolve my problem, but, in fact, it did NOT. I'm still not entirely sure what fixed it; the commit that supposedly fixed it did not, because I forgot to reference the correct commit in the issue. I believe the culprit was the first actor I implemented to handle the post-request mechanism.

# main crates
actix = "0.13.5" 
actix-web = { version = "4", features = ["rustls-0_23"] }
sqlx = { version = "0.8.0", features = [ "runtime-tokio", "tls-rustls", "postgres", "derive"] }
actix-web-actors = "4.3.0"
serde = { version = "1.0.130", features = ["derive"] }
serde_json = "1.0.122"
ipinfo = "3.0.0"

HTTP

This time, I decided to process requests through an HTTP server with POST/GET endpoints. I can support any protocol, because all a collector has to do is call the /brute/attack/add endpoint with the appropriate payload, and it will work!

I'm only going to discuss the /brute/attack/add endpoint, which is indeed a POST request. This is what happens when it is called.

#[post("/attack/add")]
async fn post_brute_attack_add(
    // `AppState` and `IndividualPayload` are assumed names here; the original
    // generics were lost, but the state holds the bearer token and actor address.
    state: web::Data<AppState>,
    payload: web::Json<IndividualPayload>,
    bearer: BearerAuth,
) -> Result<HttpResponse, BruteResponeError> {
    if !bearer.token().eq(&state.bearer) {
        return Ok(HttpResponse::Unauthorized().body("body"));
    }

    if payload.ip_address.eq("127.0.0.1") {
        return Err(BruteResponeError::ValidationError("empty ip or local ip".to_string()));
    }

    let mut individual = Individual::new_short(
        payload.username.clone(),
        payload.password.clone(),
        payload.ip_address.clone(),
        payload.protocol.clone(),
    );

    individual.validate()?;

    match state.actor.send(individual).await {
        Ok(res) => {
            websocket::BruteServer::broadcast(websocket::ParseType::ProcessedIndividual, res.unwrap());
            Ok(HttpResponse::Ok().into())
        },
        Err(er) => Err(BruteResponeError::InternalError(er.to_string())),
    }
}

What is happening?

  1. Because the endpoint is exposed to the public, I needed to add a bearer token, so the first part of the code checks whether the supplied bearer token is correct.
  2. The next part of the code is no longer necessary, because .validate() now checks the validity of the IP. All it does is check whether the address is local.
  3. The validate() function does a range of things. It checks that the strings are non-empty and within the specified length, and it checks that the IP falls within the wide range of IPs IPinfo can process; if any check fails, it throws an error.
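The real validate() lives in the repo; a minimal sketch of the same kinds of checks might look like this (the struct, field names, and the 255-character limit are assumptions for illustration, not Brute's actual code):

```rust
use std::net::IpAddr;

// Hypothetical mirror of Brute's Individual payload.
struct Individual {
    username: String,
    password: String,
    ip_address: String,
    protocol: String,
}

impl Individual {
    // Returns an error message describing the first failed check.
    fn validate(&self) -> Result<(), String> {
        // non-empty checks
        if self.username.is_empty() || self.password.is_empty() || self.protocol.is_empty() {
            return Err("empty field".to_string());
        }
        // length limits (255 is an assumed cap, not Brute's real one)
        if self.username.len() > 255 || self.password.len() > 255 {
            return Err("field too long".to_string());
        }
        // must parse as an IP, and must be one IPinfo can actually resolve:
        // loopback/private/unspecified addresses are rejected
        let ip: IpAddr = self
            .ip_address
            .parse()
            .map_err(|_| "invalid ip".to_string())?;
        let routable = match ip {
            IpAddr::V4(v4) => !v4.is_loopback() && !v4.is_private() && !v4.is_unspecified(),
            IpAddr::V6(v6) => !v6.is_loopback() && !v6.is_unspecified(),
        };
        if !routable {
            return Err("non-routable ip".to_string());
        }
        Ok(())
    }
}
```

The point of failing early like this is that a bad payload never reaches the actor or the database.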

These checks ensure that the data provided can be processed. Finally, we process the data with state.actor.send(individual), which sends the data to the designated handler. After the result is successfully processed, it gets broadcast to all connected clients via the WebSocket.

Actor

If you look at system.rs, we have an actor named BruteSystem, which implements various handlers, such as:

  1. Handler<Individual> (POST)
  2. Handler<RequestWithLimit<ProcessedIndividual>> (GET)
  3. Handler<RequestWithLimit<TopUsername>> (GET)
  4. and many, many more.

Here's the code for Handler<Individual> 

impl Handler<Individual> for BruteSystem {
    type Result = ResponseActFuture<Self, Result<ProcessedIndividual, BruteResponeError>>;

    fn handle(&mut self, msg: Individual, _: &mut Self::Context) -> Self::Result {
        let reporter = self.reporter();
        let fut = async move {
            match reporter.start_report(msg).await {
                Ok(result) => {
                    info!(
                        "Successfully processed Individual with ID: {}. Details: Username: '{}', IP: '{}', Protocol: '{}', Timestamp: {}, Location: {} - {}, {}, {}",
                        result.id(),
                        result.username(),
                        result.ip(),
                        result.protocol(),
                        result.timestamp(),
                        result.city().as_ref().unwrap_or(&"{EMPTY}".to_string()),
                        result.region().as_ref().unwrap_or(&"{EMPTY}".to_string()),
                        result.country().as_ref().unwrap_or(&"{EMPTY}".to_string()),
                        result.postal().as_ref().unwrap_or(&"{EMPTY}".to_string())
                    );
                    Ok(result)
                }
                Err(e) => {
                    error!("Failed to process report: {}", e);
                    Err(BruteResponeError::InternalError(
                        "something definitely broke on our side".to_string(),
                    ))
                }
            }
        };
        fut.into_actor(self).map(|res, _, _| res).boxed_local()
    }
}

The reporter.start_report() call starts a transaction for many database queries. That's all it really does; if you want to know a little more about it, take a look at the code here it is.

Reporter

Models should implement Reportable<T: Reporter, R> in order to inherit .report(reporter: T, model: R), so it can be called within the start_report() function.

Here's the implementation for the Reportable trait.

#[allow(async_fn_in_trait)]
pub trait Reportable<T: Reporter, R> {
    async fn report<'a>(reporter: &T, model: &'a R) -> anyhow::Result<Self>
    where
        Self: Sized;
}

Here's an example of it in use.


impl Reportable<BruteReporter, Individual> for Individual {
    async fn report<'a>(
        reporter: &BruteReporter,
        model: &'a Individual,
    ) -> anyhow::Result<Self> {
        let pool = &reporter.brute.db_pool;
        let query = r#"
            INSERT INTO individual (id, username, password, ip, protocol, timestamp)
            VALUES ($1, $2, $3, $4, $5, $6)
            RETURNING *
        "#;

        // Generate new ID and timestamp for the new instance
        let new_id = Uuid::new_v4().as_simple().to_string();
        let new_timestamp = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_millis() as i64;

        // Execute the query and get the inserted data
        let inserted = sqlx::query_as::<_, Individual>(query)
            .bind(&new_id)
            .bind(model.username())
            .bind(model.password())
            .bind(model.ip())
            .bind(model.protocol())
            .bind(new_timestamp)
            .fetch_one(pool)
            .await?;

        Ok(inserted)
    }
}
I think the code itself is pretty self-explanatory. All it's doing is creating an entry inside the 'individual' table with the specified values.

IPinfo

This time around, I decided to use their API and not their database files, though I seriously considered building a standalone API around the database files so I could call it locally. I would have just checked their checksums every 5-10 minutes and, if they changed, downloaded the new database file.

Anyway, because I use their API and IPs can change, I added a recycling system. All it does is check the desired IP: if the last entry is older than 5 minutes, it fetches from the API; if not, it reuses the cached data. This drastically decreases my usage of the API and in return saves us a ton of requests. Simple but effective.
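The recycling idea boils down to a cache with a time-to-live. Here's a minimal sketch of that idea (not Brute's actual code; names and the string payload are placeholders, and `fetch` stands in for the real IPinfo API call):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// TTL cache for IP lookups: ip -> (when it was fetched, cached data).
struct IpCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl IpCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    // Returns the cached data if it's still fresh; otherwise calls
    // `fetch` (the stand-in for the IPinfo API) and caches the result.
    fn lookup<F: FnOnce() -> String>(&mut self, ip: &str, fetch: F) -> String {
        if let Some((fetched_at, data)) = self.entries.get(ip) {
            if fetched_at.elapsed() < self.ttl {
                return data.clone(); // fresh enough: reuse it
            }
        }
        let data = fetch();
        self.entries.insert(ip.to_string(), (Instant::now(), data.clone()));
        data
    }
}
```

With a 5-minute TTL, repeated attacks from the same IP (which is the common case with brute-force traffic) cost exactly one API request per window.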

Conclusion

The successor is 100000% better than the predecessor, as it should be. The project has been running inside a Docker container for about three days now and has had no errors! So, I can safely say this project was well thought out.

Goodbye.  ☺️☺️☺️


Staying consistent.

Whatever you're doing, whether it's a project, fitness, or whatever else you like to do, the best way to stay consistent (for me) is to take away something that I like and reward myself with it after I finish.

    Project Breakdown: BruteExpose

    Project inspired by Brute.Fail 

    What is BruteExpose? (Now Known as Live Security Monitor)

    chomnr/live-security-monitor is an application that, whenever someone attempts to log in to a server that uses OpenSSH, will log the credentials used, the origin (IP & country), the attack protocol, and the date.

    Why I chose Java

    The only reason I chose Java was that I didn't have a project written in it. It's also a language I'm highly familiar with, because I used to write Minecraft plugins in Java. If I could go back and rewrite this project, I would probably write it entirely in C or C++.

    Metrics & Analytics System

    Metrics:

    To collect the actual metrics, I ended up going with IPinfo, specifically the .mmdb (MaxMind database), which I regret using because it meant I would have to constantly update the .mmdb to avoid getting weird errors in my code. Instead, I should have used the IPinfo API.

    Analytics:

    To interpret the metrics data, I wrote a modular analytics system. You can easily add or remove different stats from the code, and they will be reflected accordingly in the JSON (where the stats are located/tracked).

    Currently supported as of 6/1/2024
    • NumberOfAttemptsOverTime
    • AttackTotalByDayOfWeek
    • DistributionOfAttackProtocols
    • AttackOriginByCountry
    • AttackOriginByIp
    • CommonlyTargetedByCredential

    Integrating your own analytics: here's a snippet from ProtocolBasedMetrics.

    ProtocolBasedMetrics.java

    This file will actually help populate the value inside the .json file that keeps track of all these analytics. Many of these metrics can be grouped together into one function, because when one of the values is affected, all of them are affected as well. Therefore, we just need a simple populate() function that covers them all. If you need another example where multiple stats are tracked, look at the TimeBasedMetrics folder.

        private DistributionOfAttackProtocols distributionOfAttackProtocols;
        public enum ProtocolBasedType {
          SSH,
          UNKNOWN
        }
        public ProtocolBasedMetrics() {
          distributionOfAttackProtocols = new DistributionOfAttackProtocols();
        }
        public DistributionOfAttackProtocols getDistributionOfAttackProtocols() {
          return distributionOfAttackProtocols;
        }
        public void populate(String name, int amount) {
          getDistributionOfAttackProtocols().insert(name, amount);
        }
        public void populate(ProtocolBasedType type, int amount) {
          getDistributionOfAttackProtocols().insert(type, amount);
        }
        public void populate(String type) {
          getDistributionOfAttackProtocols().insert(type, 1);
        }
        public void populate(ProtocolBasedType type) {
          getDistributionOfAttackProtocols().insert(type, 1);
        }
     

    DistributionOfAttackProtocols.java

    This is the actual stat that will be tracked. All we do is make a simple HashMap, and our JSON ObjectMapper will handle the rest.

        private HashMap<String, Integer> protocols = new HashMap<>();
        public DistributionOfAttackProtocols() {}
        public void insert(String type, int amount) {
            ProtocolBasedType protocolType = getProtocolByName(type);
            addAttempts(protocolType, amount);
        }
        public void insert(ProtocolBasedType type, int amount) {
            addAttempts(type, amount);
        }
        private void addAttempts(ProtocolBasedType type, int amount) {
            String protocolName = getNameOfProtocol(type);
    
            if (protocols.get(protocolName) == null) {
                protocols.put(protocolName, amount);
            } else {
                protocols.put(protocolName, getAttempts(type)+amount);
            }
        }
        private Integer getAttempts(ProtocolBasedType type) {
            return protocols.get(getNameOfProtocol(type));
        }
        public ProtocolBasedType getProtocolByName(String protocol) {
            if (protocol.equalsIgnoreCase("sshd")) {
                return ProtocolBasedType.SSH;
            }
            if (protocol.equalsIgnoreCase("ssh")) {
                return ProtocolBasedType.SSH;
            }
            // or return UNKNOWN
            return ProtocolBasedType.UNKNOWN;
        }
        private String getNameOfProtocol(ProtocolBasedType type) {
            return type.name();
        }

    BruteMetricData.java

    This is where we will instantiate our analytic/metric.
        private TimeBasedMetrics timeBasedMetrics;
        private GeographicMetrics geographicMetrics;
        private ProtocolBasedMetrics protocolBasedMetrics;
        private CredentialBasedMetrics credentialBasedMetrics;
    
        public BruteMetricData() {
            timeBasedMetrics = new TimeBasedMetrics();
            geographicMetrics = new GeographicMetrics();
            protocolBasedMetrics = new ProtocolBasedMetrics();
            credentialBasedMetrics = new CredentialBasedMetrics();
        }
    
        public TimeBasedMetrics getTimeBasedMetrics() {
            return timeBasedMetrics;
        }
        public GeographicMetrics getGeographicMetrics() { return geographicMetrics; }
        public ProtocolBasedMetrics getProtocolBasedMetrics() { return protocolBasedMetrics; }
        public CredentialBasedMetrics getCredentialBasedMetrics() { return credentialBasedMetrics; }
    
    
    

    Forking OpenSSH

    If you use regular OpenSSH, you will notice that after trying to dump the password, you will get something like this:
    ^M^?INCORRECT^@"
    This is a safety mechanism built into OpenSSH to avoid leaking information via timing. So, in order to circumvent this, you will need to disable it by removing this line of code. https://github.com/openssh/openssh-portable/blob/df56a8035d429b2184ee94aaa7e580c1ff67f73a/auth-pam.c#L1198

    Now the bad password will not be overridden by OpenSSH.

    Dumping:

    Now we need to dump the credentials to a .txt file named brute_tracker.txt. It will contain the username, password, host, and protocol.

    It's a VERY simple script. Our Java application listens to brute_tracker.txt, and whenever it is edited, it automatically reads the latest entry and stores the data.

    #include "library.h"
    #include <security/pam_appl.h>
    #include <security/pam_modules.h>
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    
    #define BE_LOG_FILE "/var/log/brute_tracker.txt"
    #define BE_DELAY 700
    
    PAM_EXTERN int pam_sm_authenticate(pam_handle_t *pamh, int flags, int argc, const char **argv) {
        const char *username,
                   *password,
                   *protocol,
                   *hostname;
    
        // pam_get_item expects a const void ** for the item pointer.
        pam_get_item(pamh, PAM_USER, (const void **)&username);
        pam_get_item(pamh, PAM_AUTHTOK, (const void **)&password);
        pam_get_item(pamh, PAM_RHOST, (const void **)&hostname);
        pam_get_item(pamh, PAM_SERVICE, (const void **)&protocol);
    
        // Added a delay to ensure that BruteExpose gets to read the entry.
        // In terms of practicality, I should have written the entire program in C,
        // but I am not familiar with the language.
        usleep(BE_DELAY);
    
        FILE *fd = fopen(BE_LOG_FILE, "a");
        if (fd != NULL) {
            fprintf(fd, "%s %s %s %s \n", username, password, hostname, protocol);
            fclose(fd);
        }
    
        return PAM_SUCCESS;
    }
    
    
    After compiling the script, you need to do the following:
    • Drop the compiled module into /lib/x86_64-linux-gnu/security/
    • Open common-auth: sudo nano /etc/pam.d/common-auth

    Now you need to add libbe_pam.so right before the password gets denied.

    # here are the per-package modules (the "Primary" block)
    auth    [success=2 default=ignore]      pam_unix.so nullok
    # enable BruteExpose.
    auth    optional                        libbe_pam.so
    # here's the fallback if no module succeeds 
    auth requisite pam_deny.so
    What's happening? If pam_unix.so succeeds, it will skip the next 2 lines. If not, it will hit our PAM module and then the pam_deny module.

    What would I have done differently?

    1. I wouldn't have written a modular analytics system; instead, I would have just hardcoded the analytics for simplicity. This project is not really practical for real-world use cases, because we need to introduce a vulnerability into OpenSSH to get it working.

    2. I would not have used Java; instead, I would probably have written the entire thing in C, perhaps as a single simple PAM module. Using two separate languages introduces much more complexity and makes the project harder to maintain.

    3. Use the IPinfo API instead of their .mmdb. I really should have just used their API for simplicity. I'm not sure why I was so adamant about using their .mmdb. Boy, this is probably what I regret most, because I had to constantly update the .mmdb file manually; with an API, I wouldn't have to do that. Oh well, you live, you learn.

    4. Use SQLite instead of JSON. JSON is great, but not as a database. I could write plenty of data, but I could only read so much: after my .json file reached a certain size, I couldn't read any more data from it. SQLite, on the other hand, is great for both reading and writing, so yeah, I should have used SQLite instead of JSON.

    This is just a brief project breakdown, not as technical as my Ark breakdown, but technical enough.



    fitness.

    I started my fitness journey about 1.6 years ago, and my physique has changed A LOT since then. When I started, I was around ~150 pounds and skinny fat; now I'm still around ~150, but with much more muscle.

    For about a year, I did PPL (Push Pull Legs) as my original routine (with some negligence on the L). Then I started doing a PPL x Arnold split because I was having trouble growing my chest. Since I made the switch, my chest has definitely grown a little, not a lot, just a little. I'll take it over nothing, to be honest.

    The only supplements I take are Optimum Nutrition Micronized Creatine and Mutant Mass Weight Gainer.

    Here's a photo of me. I think I was 15 in this photo. (This is way before I started weightlifting.)

    5'10 (110 pounds)

    Unfortunately, I don't have any good photos of me right before I started weightlifting.

    Here's a pic I took in March.

    5'10 (155 pounds)