Saturday, 26 January 2019

Eleven Supernovas and the UiTM Data Leak

On 25 January 2019, the website Lowyat.net published an article concerning a data leak that allegedly contains over 1 million student records from the Universiti Teknologi Mara (UiTM). According to Lowyat, the details affected include the Student ID, Student Name, MyKAD Number, Address, Email Address, Campus Codes, Campus Names, Program Codes, Course Levels as well as Mobile Phone Numbers.

URL: https://www.lowyat.net/2019/177033/over-1-million-uitm-students-and-alumni-personal-details-leaked-online/

The data in question most likely came from a Pastebin link posted on 24 January. Titled "UiTM Student Data Leak - 11 SUPERNOVA SECURITY", the paste's author declared this to be the "Biggest Malaysian Student Data Leak". An additional sample of 10,000 rows could be downloaded from a link provided in the body of the paste.

The now-removed Pastebin content

The additional samples of 10k data
Public reporting of this 'breach' has focused on the details of the leaked content. UiTM even responded, assuring the public that the leaked data did not come from its servers and that its systems are safe and can be trusted, citing its "ISMS ISO 27001:2013" certification as evidence.

Source: https://bit.ly/2HxVh7E

This post focuses first on the hacker group "11 SUPERNOVA SECURITY", followed by a brief data analysis of the leaked sample (downloaded from Anonfile).

Disclaimer: Despite media reports of over 1 million records being leaked, I have NOT accessed or obtained the full data, nor am I aware of it having been made publicly available. My analysis is therefore based purely on the sample data that was made public on 24 January.

11 Supernova Security Hacker Group Analysis


A quick search reveals that this group has a social media presence and has been active since the creation of its social media profile in September 2018.

The Hacker Group's Logo

The First Defacement Message in September 2018

The Defacement Message Used on Victims from the Beginning of 2019

The contents and the language used suggest that the actors in this group are:
  • Highly likely based in Malaysia
  • Possibly interested in Malaysian current affairs or politics (based on the defacement message)
  • Targeting mostly Malaysian Government (*.gov.my) and Education institutions (*.edu.my)
  • Fans of the One Piece anime, specifically the team 11 Supernovas (one member also goes by Sir.Crocodile, another character from One Piece)

Eleven Supernovas from the anime One Piece

Based on their social media posts, the group has targeted almost 40 websites since 11 September 2018.

  • 11 websites ending with .gov.my domain
  • 20 websites ending with uitm.edu.my domain
  • 6 random websites
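
A tally like the one above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names and the sample hostnames are hypothetical and are not taken from the group's posts.

```python
from collections import Counter


def classify(hostname: str) -> str:
    """Put a hostname into one of the three buckets used in the tally."""
    host = hostname.strip().lower().rstrip(".")
    # Match on a dot boundary so that e.g. "notuitm.edu.my" is not counted.
    if host == "uitm.edu.my" or host.endswith(".uitm.edu.my"):
        return "uitm.edu.my"
    if host.endswith(".gov.my"):
        return ".gov.my"
    return "other"


def tally(hostnames):
    """Count how many hostnames fall into each bucket."""
    return Counter(classify(h) for h in hostnames)


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    sample = ["portal.example.gov.my", "faculty.uitm.edu.my", "example.com"]
    for bucket, count in tally(sample).items():
        print(bucket, count)
```

Feeding the function the hostnames collected from the group's posts (one per defacement) would give the three counts directly.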


It seems that this group has an "intimate" interest in UiTM. On 22 January, before the leak was announced publicly, the group actually posted a claim that they had all the UiTM databases.

Claiming to have the UiTM Database Two Days before the Public Leak

However, hours after Lowyat's article was published, the group issued a statement denying any involvement in the leak and claiming that they had been sabotaged instead.

Claiming that they have been sabotaged

The group went on to state that they were not involved in the breach reported in the news and that "other groups" were using their name to distribute the leak. The group added that the only data they had was the config host file from the IP address that was hosting the websites.

Stressing that the group's name has been misused by "other groups"

In parallel to the claims made by the group, a Twitter account with the handle @Rob1nSecurity was identified posting two links: 1) a new Pastebin link with the exact same content as the one previously removed by Pastebin and 2) a link to the group's Facebook page. The profile was quickly deactivated and can no longer be seen.

The Twitter account created to share the link to the leaked content

The challenge here is attribution. Was the 11 Supernovas Security hacker group responsible just because the Pastebin content said so? Or was the group really sabotaged by other individuals or groups who took this as an opportunity to leak data they had already held for a while?

Based on the activity on the group's social media page, I would say they are quite interactive: every time they hacked or defaced a website, they quickly shared it on the page to generate attention and gather more fans. This pattern is shared by most hacker groups in South East Asia, where leaking data after defacing a website is not common practice, unlike among hacker groups and hacktivists in Latin America or Europe. One has to ask why 11 Supernovas Security didn't share the Pastebin link with the followers of its page, given how actively it shares its successful hacks.

With that in mind, the creation of a Twitter account, @Rob1nSecurity, just to share a new Pastebin link to the samples is definitely questionable (especially its relation to 11 Supernovas Security). Here we have a social media page that is always posting links to the group's hacks, and then we have someone who made the effort to create a Twitter profile just to share the links? And with no followers? If the intention was to spread the links to a wider audience, shouldn't they have been shared on the group's other social media page, which has more followers and is more interactive?

Data Analysis


Please note that all the analysis here was done on the "uitm_gift.csv" file.

Now it is important to verify that the data is legitimate. There have been many cases where data that hackers claimed to have stolen and leaked eventually turned out to be either fake or reused from old breaches.

With the samples in hand, I attempted to check whether this dataset was reused, or whether the information could be found in other databases or websites that may have accidentally exposed it through misconfiguration. Within the limited resources I had at hand, I was unable to find this dataset anywhere else.

The next step I took was to verify the contents of the data. This was done by selecting samples of the data and verifying them manually against open source databases or any publicly available data that could be found on the internet. In this case, I selected two to three individuals and checked that they are real people rather than falsified, randomly generated ones.

I was able to verify the individual I chose to fact-check using OSINT/SOCMINT methods.

Verifying the details in the sample leak

However, I also needed to confirm that this person was really a UiTM student, which meant relying on UiTM's own data. The challenge was that no direct record of this student was publicly available on any UiTM website. Using several keywords and data combinations, I was able to find a UiTM presentation deck, hosted on a non-UiTM website, that included details of some UiTM students, including the student in question. One thing to note is that the IC number and the Student ID in the deck are different from the ones in the sample leak. The Student ID in the sample leak was 2015xxx, while the deck shows 2013xxx instead.

Everything matches except for the Student ID and IC no.
While most of the details in the sample leak concerning this student were accurate, two of them (the Student ID and the IC number) did not match.

According to the image posted on Lowyat, the third column is the IC no. data, which I presume is the 12-digit MyKad identification number:

Image from LowYat

So it is reasonable to assume that the dataset I'm looking at, "uitm_gift.csv", has the same attributes (with the third column being the IC no. field). However, looking at the IC no. column in "uitm_gift.csv", while the majority of the values have 12 digits, others have only six, eight or eleven characters (some with a combination of alphanumeric and special characters). In fact, some of them did not look like IC numbers at all, and among those with 12 digits, almost all end in six 0s and some are duplicated.

Questionable values in the IC number column
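
The checks described above (value length, non-numeric characters, 12-digit values ending in six 0s, and duplicates) can be sketched in Python roughly as follows. This is an illustration, not the exact procedure I used: it assumes the IC no. is the third field (index 2) of a CSV with no header row, the function names are hypothetical, and the values in the usage example are made up rather than taken from the leak.

```python
import csv
import io
import re
from collections import Counter

# A MyKad number is expected to be exactly 12 digits.
TWELVE_DIGITS = re.compile(r"[0-9]{12}")


def ic_column(csv_text: str, index: int = 2):
    """Return the values of one field (assumed: third field, no header row)."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row[index] for row in reader if len(row) > index]


def profile_ic_values(values):
    """Summarise how many values look like plausible 12-digit IC numbers."""
    values = [v.strip() for v in values]
    counts = Counter(values)
    well_formed = [v for v in values if TWELVE_DIGITS.fullmatch(v)]
    return {
        "total": len(values),
        "length_histogram": dict(Counter(len(v) for v in values)),
        "not_12_digits": len(values) - len(well_formed),
        "12_digits_ending_000000": sum(v.endswith("000000") for v in well_formed),
        "distinct_values_seen_more_than_once": sum(c > 1 for c in counts.values()),
    }


if __name__ == "__main__":
    # Made-up rows for illustration only.
    text = "1,A,111111000000\n2,B,111111000000\n3,C,12345678\n4,D,AB#12345678\n"
    print(profile_ic_values(ic_column(text)))
```

A high share of malformed, zero-padded or repeated values in a field that should be unique per person is what raised my doubts about this column.
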
This led me to assess that this specific dataset may not have come directly from a legitimate UiTM database (assuming UiTM has a policy of never changing Student IDs and the IC numbers in its database were correctly recorded). Another possibility, however, is that the data was obtained from multiple locations and got mixed up in the process, so that the wrong details were combined during compilation.


Friday, 18 January 2019

Independent Investigation into the 773 Million Records - "Collection #1"

On 17 January 2019, the media reported on an article posted by Troy Hunt about a huge trove of data, with nearly 773 million records exposed in a giant 87GB archive. On the same day, Brian Krebs posted an article stating that the 773 million password "Megabreach" was most likely made up of old data.

So on the night of 18 January, I started to conduct my own research and investigation to find out more about this Megabreach. Using the screenshot of the database trove posted on Troy Hunt's website, I decided to start digging...

Screenshot of the 'archive' posted on Troyhunt.com

Taking some of the text in the screenshot, I found that many of the domains listed in the screenshot also appeared in a list called "dumplist.txt", posted on 28 September 2018.

Thousands of 'hacked DBs' dumps. No data - just the names of the dumps.

According to Troy Hunt, he was directed to a post on a well-known hacking forum, which is where the screenshot was taken from. I was able to find a forum that I believe is the one Troy Hunt was directed to.

Posted on a forum known for hacking, cracking and even advertising hacked DBs

Upon further inspection of the screenshot/post, I noticed that the naming convention was quite interesting and realized that the list was most likely copied and pasted from dumplist.txt. Two examples of the similarities are highlighted below:

Same naming convention and structure

Troy Hunt also posted the list of "allegedly hacked databases" amounting to over 2000 'DBs' on Pastebin: https://pastebin.com/UsxU4gXA

While cross-referencing Troy Hunt's list against dumplist.txt, which confirmed my view that some (or most) of the hacked DBs were listed in dumplist.txt using the exact same naming convention, mirrored word for word, I also identified a Pastebin post that lists most of these DBs.

Posted on 29 August 2018

A total of 8301 presumably hacked DBs

This list not only has the same naming convention but also contains over 8000 allegedly hacked DBs. This list, however, was posted on 29 August 2018!
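
Cross-referencing two such lists of dump names comes down to a set intersection. The Python sketch below is illustrative only: it assumes each list is a plain sequence of names (one per line in the original files), the function names are hypothetical, and the names in the usage example are made up.

```python
def normalise(name: str) -> str:
    """Trim whitespace and ignore case so identical names compare equal."""
    return name.strip().lower()


def overlap(list_a, list_b):
    """Compare two lists of dump names and report what they share."""
    a = {normalise(n) for n in list_a if n.strip()}
    b = {normalise(n) for n in list_b if n.strip()}
    shared = a & b
    return {
        "shared": sorted(shared),
        "only_in_a": sorted(a - b),
        # Fraction of list A that also appears in list B.
        "share_of_a": len(shared) / len(a) if a else 0.0,
    }


if __name__ == "__main__":
    # Made-up dump names for illustration only.
    newer = ["Site-One.com.txt", "site-two.net.sql", ""]
    older = ["site-one.com.txt ", "site-three.org.sql"]
    print(overlap(newer, older))
```

Exact matching after light normalisation is deliberate here: the point being tested is whether the names were mirrored word for word, so fuzzy matching would weaken the comparison.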

According to Troy Hunt and Brian Krebs, the data in Collection #1 is a compilation of previous data breaches, advertised as a 'new' database for sale. I took the liberty of researching some of the sample 'hacked' DBs to check whether they were indeed not new. To do this, I took a hacked DB from the latest Pastebin content (posted by Troy Hunt) and then looked for it in the contents of the 2018 posts.

Using kabarindonesia[.]com as a sample

For this particular example, the hacked DB allegedly belonging to kabarindonesia[.]com was present in all the lists. This was further confirmed by Breach Aware, which lists kabarindonesia[.]com as one of the victims of a data breach in early 2018.

kabarindonesia[.]com listed as one of the victims of a past data breach

Further investigation reveals that the hacking forum post shown in the screenshot on Troy Hunt's website was not ground zero. I was able to identify a post on a different forum that was selling databases similar to the ones in Collection #1.

A typical advertisement in hacking forums

The list of databases for sale in this advertisement

On closer inspection, the data advertised appears to be similar to the contents of Collection #1. I assess (with medium confidence) that the data advertised in the forum is possibly also included in the bigger Collection #1 data (right).

Possibly same data, different seller and databases

Now, according to Brian Krebs, his interaction with the seller known as "Sanixer" on Telegram revealed that Collection #1 (87GB) was just the beginning of a bigger 993.36GB (almost 1TB) data dump, which was being sold for just $45!

The Telegram User Sanixer (below right)

While Sanixer was advertising the 1TB of data for $45, I spotted a forum post that was actually giving it away for free!!! Apparently the forum user was unhappy, claiming that Sanixer was sharing his "Infinity Black Combo" in that storage. In retaliation, he posted links through which all of the 1TB data could be accessed for free!!

Links to the 1 TB data! 

And to make things worse, other forum users were spotted posting links to this data as well. The post below was made on 9 January, and another on 19 January.

Another set of links to the 1 TB data

Another post from a different forum 

For ethical reasons, I did not download any of this content. However, I took screenshots to show what was being offered.

Screenshot of Collection 1

Screenshot of Collection 2

Screenshot of Collection 3

Screenshot of Collection 4

Screenshot of Collection 5

Screenshot of Antipublic 1

Screenshot of Antipublic 2
In conclusion, I believe that most (if not all) of this data is not new and could already be bought or downloaded from existing databases in the deep web. While some researchers and journalists have described this 'breach' as the biggest or largest ever, allow me to remind you of the 1.4 billion credentials leak of 2017, reported by 4iQ, and the Exploit[.]in compilation of over 592 million accounts (leaked databases) in the same year. I have a feeling that this 1 TB of data advertised in the underground community is merely a compilation of breaches from previous years up to mid-2018.