Sunday, 7 June 2020

CYBAR OSINT CTF - Everything Except Two!

It's been a while since I last blogged here. Tonight I'd like to share the experience my two buddies and I had during the 24-hour Open Source Intelligence (OSINT) Capture-The-Flag (CTF) online competition organized by CYBAR.


According to its website, "the challenges focus(ed) on IMINT, GEOINT and journalistic investigative techniques surrounding a threat intelligence specialist, self-aware Roombas and containing the Coronavirus pandemic".

This was the first official public OSINT CTF I had participated in. Although IMINT and GEOINT are not my strongest skills, I had shown I could handle them after solving an unofficial challenge provided by an instructor during my SANS OSINT course back in 2018, and I eventually won the in-class CTF at the end of that course.

So after my buddy shared the link to join with me, I accepted and invited another buddy of mine to join. Made up of three individuals, the team was called "Apa2Lah", which is basically Malay slang for "whatever".

The CTF organized by CYBAR was straightforward: solve the challenges, gain as many points as possible, and the top few win the cash prizes. Our intention was simply to experience the kind of questions in this CTF, and I have to admit that once we scored a couple of points, we got hooked and zoned into the challenges. Something I had wanted to spend maybe 2-3 hours on became 15 hours (the other 9 hours were for sleeping and family time).

To cut the story short, we were able to answer all the questions within 7 hours, except for two. Two bloody questions, worth the two highest point values, which we spent the remaining hours trying to solve.

Fig 1. Everything except for two!

These were the two questions we were unable to solve. 

Fig 2. The question that made me explore Texas for 5 hours

Fig 3. Another mind-boggling question

When CYBAR released the answers on its GitHub page after the CTF was over, I was like... what the hell. Although I admit that, given more time and a slightly better hint, we could possibly have solved the one worth 725 points. Seriously, hats off to those who were able to solve this without being given any hints or clues. We were looking at the wrong thing in the right place.

But the one worth 675 points was just mind-boggling. The answer to this... well, we never would have guessed that the answer to this question was the answer to this question. It had nothing to do with Wuhan or Zuckerberg, and it was not the name of a place, a person or a map. Take a look at the answer here.

We answered some of the questions in ways other than the 'official' one. For example, take the question below:
Fig 4. Getting Alycee's birthday

The official way of getting the answer can be found here. However, we took a different route. We found Alycee's art profile, which reveals the month and day of her birthday but not the year. Scrolling through Alycee's Twitter profile, we saw a picture and, when we looked more closely, spotted a password with the number 89 in it. As many people tend to use their birth year as part of their password, we gave it a shot: CYBAR{01/01/1989} and it was correct!

Fig 5. Day and month of her birthday

Fig 6. A possible year of her birthday
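
The guess above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not something we ran during the CTF: the function name, the `latest_year` cut-off and the day/month ordering inside the flag are my own assumptions, and only the CYBAR{...} wrapper comes from the flag we submitted.

```python
from datetime import date

def candidate_flags(day, month, two_digit_year, latest_year=2020):
    """Expand a two-digit year into plausible four-digit birth years and
    format each one in the CYBAR{DD/MM/YYYY} style (ordering is assumed)."""
    flags = []
    for century in (1900, 2000):
        year = century + two_digit_year
        if year > latest_year:
            continue  # a birth year in the future is not plausible
        try:
            date(year, month, day)  # rejects impossible dates such as 31 February
        except ValueError:
            continue
        flags.append(f"CYBAR{{{day:02d}/{month:02d}/{year}}}")
    return flags

print(candidate_flags(1, 1, 89))
```

With a two-digit year like 89 only one century survives the cut-off, which is why a single attempt was enough in our case.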

This was another interesting question which we managed to pull through, thanks to Google. The official way (official answer here) is slightly different from ours in that we relied on a different database to locate Scott.

Fig 7. Static on the Wire question

We first googled "ham radio" AND "florence alabama" and selected the first website displayed in the results. Then we searched that site for "Scott" and there was the answer!

Fig 8. Relying on a different database website

Another question we did not answer in the official way was this:
Fig 9. Pay $9 to get this answer

So according to the official solution here, in order to get her middle name, you have to "Find the ABN/ACN of the company "CYBAR PROPERTIES PTY LTD". Once found, visit the ASIC register information and purchase the $9 company information record."

We found a domain tied to Lillie Cawthorn. The domain was not accessible and no website was hosted on it, so we ran it through an online domain records tool to check whether we could get its WHOIS records. Surprisingly, we managed to find Lillie's full name and submitted the middle name as the flag! Saved us $9!

Fig 10. The middle name
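
For anyone who prefers to script this step, here is a minimal sketch of pulling a registrant's name out of raw WHOIS text. It assumes the record uses a "Registrant Name:" label, which is common but not universal (the format varies by registry, and many records are now redacted). The sample record and the name in it are made up for illustration and are not the record we found; fetching the record itself is left out.

```python
import re

def registrant_name(whois_text):
    """Return the value of the first 'Registrant Name:' line, or None."""
    match = re.search(
        r"^[ \t]*Registrant Name:[ \t]*(.+?)[ \t]*$",
        whois_text,
        re.MULTILINE | re.IGNORECASE,
    )
    return match.group(1) if match else None

def middle_names(full_name):
    """Return everything between the first and last name parts."""
    parts = full_name.split()
    return parts[1:-1]

# Hypothetical WHOIS excerpt, for illustration only.
sample = """Domain Name: example.com
Registrant Name: Jane Mary Doe
Registrant Country: AU
"""

name = registrant_name(sample)
print(name, middle_names(name))
```

Splitting on whitespace is a crude way to isolate a middle name, so any result should still be checked by eye before submitting it as a flag.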

Another question which was not straightforward to answer was this:

Fig 11. Fake news question

We initially googled for news articles mentioning the incident described in the 'Article text' pictured above, and even filtered for news published on 29th February just to see if anyone had published anything related to it. But it's no surprise that a question worth 650 points cannot be answered with a mere Google search. It took us about an hour or two before we arrived at the answer. Our process was not the same as the 'official' way here. Instead, we visited another Australian government public database and played around with the filters to determine the answer. This answer briefly brought us into the top 25.

Fig 12. Generating the answers after doing some filtering

After 24 hours, and with 2 flags still not completed, we just had to let it go. But it was good; I believe we learned a lot from this competition. A three-man team made up of a cybersecurity analyst, a data scientist and a threat intelligence analyst, none of us experienced in IMINT and GEOINT, managed to finish in 32nd position (tied on points with many others) out of over 160 teams that joined.

Fig 13. Falling at 32nd place

Fig 14. 161 teams registered for the challenge

Participating in CTFs is one way to explore your skills: what you are good at and what you need to improve. It's also an opportunity for me to understand what skills or tools are needed to answer those questions, and if I feel they would be useful to me, I would go on to learn them further. If time permits, I would definitely participate in more of these online CTFs - less pressure, and in the comfort of my own home.

Saturday, 26 January 2019

Eleven Supernovas and the UiTM Data Leak

On 25 January 2019, the website Lowyat published an article concerning a data leak that allegedly contained over 1 million student records from the Universiti Teknologi Mara (UiTM). According to Lowyat, the details affected include the Student ID, Student Name, MyKAD Number, Address, Email Address, Campus Codes, Campus Names, Program Codes, Course Levels as well as Mobile Phone Numbers.


The data in question most likely came from a Pastebin link posted on 24 January. Titled "UiTM Student Data Leak - 11 SUPERNOVA SECURITY", the Paste declared this to be the "Biggest Malaysian Student Data Leak". An additional sample set of 10,000 rows could be downloaded from a link provided in the body of the Paste.

The now-removed Pastebin content

The additional samples of 10k data

Public reporting of this 'breach' has focused on the data/details of the leaked content, and even UiTM responded, assuring that the leaked data was not from their servers and that their systems are safe and can be trusted - evidenced by their "ISMS ISO 27001:2013" certification.


This post will focus first on the hacker group "11 SUPERNOVA SECURITY", followed by a brief data analysis of the leaked sample (downloaded from Anonfile).

Disclaimer: In spite of media reporting of over 1 million details leaked, I have NOT accessed or obtained the full dataset, nor am I aware whether such data has been made available publicly. Hence my analysis is purely based on the sample data which was made public on 24 January.

11 Supernova Security Hacker Group Analysis

A quick search reveals that this group has a social media presence and has been active since the creation of its social media profile in September 2018.

The Hacker Group's Logo

The First Defacement Message in September 2018

The Defacement Message on Victims starting from the beginning of 2019

An analysis of the contents and the language used suggests that the actors in this group are:
  • Highly-likely based in Malaysia
  • Possibly have an interest in Malaysia current affairs or politics (based on the defacement message)
  • Targeting mostly Malaysian Government (* and Education institutions (*
  • Fans of the One Piece anime, specifically the team 11 Supernovas (with one member also calling themselves Sir.Crocodile - another character from One Piece)

Eleven Supernovas from the anime One Piece

Based on their social media posts, the group has targeted almost 40 websites since 11 September 2018.

  • 11 websites ending with domain
  • 20 websites ending with domain
  • 6 random websites

It seems that this group has an "intimate" interest in UiTM. On 22 January, before the leak was announced publicly, the group actually posted a claim that they had all the UiTM databases.

Claiming to have the UiTM Database Two Days before the Public Leak

However, hours after Lowyat's article was published, the group issued a statement denying any involvement in the leak and claiming that they had been sabotaged instead.

Claiming that they have been sabotaged

The group went on to state that they were not involved in the breach reported in the news, but that "other groups" were using their name to distribute the leak. The group also added that the only data they had was the config host file from the IP that was hosting the websites.

Stressing that the group's name has been misused by "other groups"

In parallel to the claims made by the group, a Twitter account with the handle @Rob1nSecurity was identified posting two links: 1) a new Pastebin link with the exact same content as the one previously removed by Pastebin, and 2) a link to the group's Facebook page. The profile was quickly deactivated and can no longer be seen.

The Twitter account created to share the link to the leaked content

The challenge here is attribution. Was the 11 Supernovas Security hacker group responsible just because the Pastebin content said so? Or was the group really sabotaged by other individuals/groups taking this as an opportunity to leak data they had already held for a while?

Based on the activity on the group's social media page, I would say they are quite interactive: every time they hacked/defaced a website, they would quickly share it on the page to generate attention and gather more fans. This process is shared by most hacker groups in South East Asia, where leaking data after defacing a website is not a common procedure, compared to hacker groups/hacktivists in Latin America or Europe. One has to ask why 11 Supernovas Security did not share the Pastebin link on its page with its followers (seeing how actively they share their successful hacks).

With that in mind, the creation of a Twitter account, @Rob1nSecurity, just to share a new Pastebin link to the samples is definitely questionable (especially its relation to 11 Supernovas Security). Here, we have a social media page that is always posting links to the group's hacks, and then we have someone who made the effort to create a Twitter profile just to share the links? With no followers? If the intention was to spread the links to a wider audience, shouldn't they have been shared on the group's other social media page, which has more followers and more interaction?

Data Analysis

Please note that all of the analysis here was done on the "uitm_gift.csv" file.

Now it is important to verify that the data is legitimate. There have been many cases where details that hackers claimed to have hacked and leaked eventually turned out to be either false data or data reused from old breaches.

With the samples in hand, I attempted to check whether this dataset was reused data, or whether the information could be found in other databases or websites that may have accidentally exposed it due to misconfiguration. However, given the limited resources I had at hand, I was unable to find this dataset available anywhere else.

The next step I took was to verify the contents of the data. This was done by selecting samples of the data and verifying them manually against open source databases or any publicly available data that could be found on the internet. In this case, I selected two to three individuals and checked that they are real individuals rather than falsified, randomly generated people.

I was able to verify the individual I chose to fact-check using OSINT/SOCMINT methods.

Verifying the details in the sample leak

However, I also needed to ensure that this person was really a UiTM student, and for that I needed to rely on UiTM's own data. The challenge was that there wasn't a direct record of this student made publicly available on any of UiTM's websites. Using several keyword and data combinations, I was able to find a UiTM presentation deck, hosted on a website that is not a UiTM website, which includes details of some UiTM students, including the student in question. One thing to note is that the IC number and the Student ID here are different from the ones in the sample leak. The Student ID in the sample leak was 2015xxx, while this finding reveals it to be 2013xxx instead.

Everything matches except for the Student ID and IC no.

While most of the details in the sample leak concerning this student were true, two of them (the Student ID and the IC number) were not.

According to the image posted on Lowyat, the third row is the IC no. data, which I presume is the 12-digit MyKad identification number:

Image from LowYat

So it is reasonable to assume that the dataset I'm looking at, "uitm_gift.csv", mirrors the same attributes (where the third row is the IC no. field). However, looking at the IC no. row in "uitm_gift.csv", while the majority of the values are 12 digits long, some values were found to be six, eight or eleven characters long (some with a combination of alphanumeric and special characters). In fact, some of them did not look like IC numbers at all, and of those with 12 digits, almost all have six 0s at the end and some are duplicated.

Questionable values for the IC number row

This led me to assess that this specific dataset may not have come directly from a legitimate UiTM database (assuming UiTM has a policy of never changing Student IDs and that the IC numbers in its database were correctly recorded). Another possibility, however, is that the data was obtained from multiple locations and got mixed up in the process - hence the wrong details being combined during compilation.
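
The checks on the IC no. field described above can be roughly sketched as follows. This is a simplified illustration and not the exact process I followed: the labels, function names and sample values are all made up, and no real values from the leak are reproduced here.

```python
import re
from collections import Counter

def classify_ic(value):
    """Label a raw IC no. value using the checks described in the post."""
    v = value.strip()
    if not re.fullmatch(r"[0-9]{12}", v):
        return "malformed"      # wrong length, letters or special characters
    if v.endswith("000000"):
        return "zero_padded"    # 12 digits, but the last six are all 0s
    return "plausible"

def summarise(values):
    """Count each label and list the values that appear more than once."""
    labels = Counter(classify_ic(v) for v in values)
    counts = Counter(v.strip() for v in values)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    return labels, duplicates

# Hypothetical values, for illustration only.
sample = ["900101000000", "900101000000", "12345678", "AB-1234/567", "850315105123"]
labels, duplicates = summarise(sample)
print(dict(labels), duplicates)
```

A "plausible" label here only means the value has the right shape; it says nothing about whether the number belongs to the person in that record.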

Friday, 18 January 2019

Independent Investigation into the 773 Million Records - "Collection #1"

On 17 January 2019, media reports covered an article posted by Troy Hunt about a huge trove of nearly 773 million records exposed in a giant 87GB archive. On the same day, Brian Krebs posted an article stating that the 773 million password "Megabreach" was most likely old data.

So on the night of 18 January, I started to conduct my own research and investigation to find out more about this Megabreach. Using the screenshot of the database trove posted on Troy Hunt's website, I decided to start digging...

Screenshot of the 'archive' posted on

Taking some of the text in the screenshot, I found that many of the domains listed in the screenshot also appeared in a list called "dumplist.txt" posted on 28 September 2018.

Thousands of 'hacked DBs' dumps. No data - just the names of the dumps.

According to Troy Hunt, he was directed to a post in a well-known hacking forum, which was where the screenshot was taken from. I was able to find the forum, and I believe this was the forum Troy Hunt was referring to.

Posted on a forum known for hacking, cracking and even advertising hacked DBs

Upon further inspection of the screenshot/post, I noticed the naming convention was quite interesting and realized that this was most likely copied and pasted from dumplist.txt. Two examples of the similarities are highlighted below:

Same naming convention and structure

Troy Hunt also posted the list of "allegedly hacked databases" amounting to over 2000 'DBs' on Pastebin:

While I was cross-referencing Troy Hunt's list against dumplist.txt and confirming my view that some (or most) of the hacked DBs were listed in dumplist.txt using the exact same naming convention, mirrored word for word, I also identified a Paste on Pastebin which has most of these DBs listed in its content.

Posted on 29 August 2018

A total of 8301 presumably hacked DBs

This list not only has the same naming convention, but also contains over 8000 allegedly hacked DBs. This list, however, was posted on 29 August 2018!
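
The cross-referencing between lists was done by eye, but the same comparison could be sketched in code along these lines. Treat it as an assumption-laden sketch: the function names and the normalisation rule (lowercasing and collapsing whitespace) are my own choices, and the sample entries are invented rather than taken from any of the actual lists.

```python
def normalise(name):
    """Lowercase a dump name and collapse runs of whitespace."""
    return " ".join(name.strip().lower().split())

def cross_reference(list_a, list_b):
    """Return the names shared by both lists and the share of list_a they cover."""
    a = {normalise(n) for n in list_a if n.strip()}
    b = {normalise(n) for n in list_b if n.strip()}
    shared = a & b
    coverage = len(shared) / len(a) if a else 0.0
    return sorted(shared), coverage

# Hypothetical entries, for illustration only.
newer_list = ["example-one.com {10k} [HASH].txt", "Example-Two.net  {5k}.txt", "example-three.org.txt"]
older_list = ["example-one.com {10k} [HASH].txt", "example-two.net {5k}.txt", "example-four.io.txt"]

shared, coverage = cross_reference(newer_list, older_list)
print(shared, round(coverage, 2))
```

A high coverage figure only shows that the names match; it does not prove that the underlying data is identical, which is why I still checked individual samples.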

According to Troy Hunt and Brian Krebs, the data in Collection #1 is a compilation of previous data breaches, advertised as a 'new' database for sale. I took the liberty of researching some of the sample 'hacked' DBs to determine whether these DBs were indeed not new. To do this, I took a hacked DB from the latest Pastebin content (posted by Troy Hunt) and then looked for it in the content of the 2018 posts.

Using kabarindonesia[.]com as a sample

For this particular example, the hacked DB allegedly belonging to kabarindonesia[.]com was present in all the lists. Additionally, Breach Aware further confirmed that kabarindonesia[.]com was one of the victims of a data breach in early 2018.

kabarindonesia[.]com pointed as one of the victims of a past data breach

Further investigation revealed that the hacking forum post shown in the screenshot on Troy Hunt's website was not ground zero. I was able to identify a forum post (on a different forum from the one in Troy Hunt's article) that was selling databases similar to the ones in Collection #1.

A typical advertisement in hacking forums

The list of databases for sale in this advertisement

Upon closer inspection of the data advertised, the offering appears similar to the contents of Collection #1. I assess (with medium confidence) that the data advertised in the forum may also be included in the larger Collection #1 dataset (right).

Possibly same data, different seller and databases

Now according to Brian Krebs, his interaction with the seller known as "Sanixer" on Telegram reveals that Collection #1 (87GB) was just the beginning of the bigger 993.36GB (almost 1TB) data dump. This was being sold for just $45!

The Telegram User Sanixer (below right)

While Sanixer was offering/advertising the 1TB of data for $45, I spotted a forum user who was actually giving it away for free!!! Apparently the forum user was unhappy, claiming that Sanixer was sharing his "Infinity Black Combo" in that storage. As an act of retaliation, he posted links to all of the 1TB of data so that it could be accessed for free!!

Links to the 1 TB data! 

And to make things worse, other forum users were spotted posting links to this data as well. The post below was made on 9 January, and another on 19 January.

Another set of links to the 1 TB data

Another post from a different forum 

For ethical reasons, I did not download any of this content. However, I took screenshots to show what was being offered.

Screenshot of Collection 1

Screenshot of Collection 2

Screenshot of Collection 3

Screenshot of Collection 4

Screenshot of Collection 5

Screenshot of Antipublic 1

Screenshot of Antipublic 2

In conclusion, I believe that most (if not all) of this data is not new, and could be either bought or downloaded from existing databases in the deep web. While some researchers and journalists reported this 'breach' as the biggest or largest breach, allow me to refresh your memory of the 1.4 Billion credentials leak of 2017, reported by 4iQ, and the Exploit[.]in compilation of over 592 Million accounts (leaked databases) in the same year. I have a feeling that this 1 TB of data advertised in the underground community is merely a compilation of breaches from previous years up until mid 2018.