Implement everything by yourself?

Description

You can only trust code you wrote yourself? At least one of the SFL tutors seems to think using self-made cryptographic functions is a good idea O_o

You've stolen a hashed message. Can you "unhash" it? You do know the SFL team is using the faculty's "Gitea" service for there software development. Maybe they left something there?

Message: \tbt.aLchkb}(#Wias"_itShVe__MlHe

Unfortunately, the hash function is not only a bad one-way function. It also allows for several collisions. You may have to slightly tweak/fix the flag once you've found it; only one flag is correct, not its collisions.

Solution

How does the hash work

At first we take a look at the mentioned "Gitea service". If we scroll down or use the search function we will find a repository "marvin.weiler/sfl_2022_2023_hash". Inside this repo there are three files: README.md, LICENSE and hash.py. We are only interested in the file hash.py

This file was obfuscated. So let's take a look what is going on:

  • In line 9 the value sl is set to the length of the input_string. The eval code evaluates len(input_string).
  • Line 11- 20 assures that the input_String has always 32 chars.
  • Line 25- 29 creates two arrays, h665 for each character for the first half of the string, and sta with the ord() value for each character of h665
  • Line 53- 59 f2(v1, v2). v1 is a character and v2 a number. The character v1 is shifted v2 times. If the resulting character v3 is outside the printable range \( 32 \leq v3 \leq 126 \) it gets shifted again and the current shifting iteration gets added.
  • Line 33 - 40 are iterating over the second half of the input_string and applies the shift function f2 to each character. This is done exactly once because function f3 will always return false.
  • Line 43- 48 the two half strings are reassembled to one. It is always firstly a shifted character from the first half used then the not shifted from the second half. This is done for all characters.
  • Line 51 replaces the space " " character with an "_" and returns the string.

Reverse the hash

To reverse the hash we first split the output string into the two arrays that they were before line 45.

#h664 = first_half
#h665= second_half
first_half  = [\b.Lhb(Wa"iSV MH]
second_half = [ttack}#is the le]

The second half of the string ends with "ttack}" the rest of the word is padding applied in 18-20.

We do now have to find an input that produces the string "\b.Lhb(Wa"iSV MH" when it gets shifted by the second half. E.g. the char "\" has the UTF-8 value of 92 and got shifted by "t" which has value 116. So we need to substract "\" - "t" which is \( 92 - 116 = -24 \). Ascii and Unicode characters cannot have negative values, and we see that in line 54 the shifted value has a upper bound by ** %126 ** so we also wrap around and calculate \( 126 - 24 = 102 \) and the character "f".

You can repeat this for every character to get a valid input for a same hash value. But the challenge says:

Unfortunately, the hash function is not only a bad one-way function. It also allows for several collisions. You may have to slightly tweak/fix the flag once you've found it; only one flag is correct, not its collisions.

So there are multiple possible values. This happens because we cannot be sure how often the loop in line 56- 58 shifted the character. So some characters need to be shifted multiple times till the flag is accepted.

Sourcecode for hash.py

This is the orignal code from the hash.py file:

import math

flag = "fl4g{collision_attack}"
comment = "#input_string must only contain ascii characters"
comment2 = "#is the length of input_string important?"
def hash(input_string):

    string_lenght = len(input_string)

    #max 32 chars
    if string_lenght > 32:
        input_string = input_string[:32]
        string_lenght = 32


    #Padding
    if string_lenght != 32:
        l = 32 - string_lenght
        input_string = input_string + comment2[:l]
    string_lenght = len(input_string)

    
    first_half = input_string[:int((string_lenght / 2))]
    second_half = input_string[int(string_lenght / 2) : string_lenght]

    amount_to_shift = [] #How far should each char be shifted
    for char in second_half:
        amount_to_shift.append(ord(char))

    for i in range(int(string_lenght / 2)):
        first_half = (first_half[:i] +
                choose_char(first_half[i], amount_to_shift[i]) +
                first_half[i + 1:])
    i = 0
    input_string = ""
    while i < string_lenght / 2:
        input_string += first_half[i]
        input_string += second_half[i]
        i = i+1


    return input_string.replace(" ", "_")

def choose_char(char, shift):
    num = (ord(char) + shift) %126
    it = 0
    while (num < 32 or num > 126):
        num = (num + ord(char) + shift + it) % 126
        it = it +1
    return chr(num)

 
flag_hash = hash(flag)
print("Orig_hash {" ,flag_hash, "}")