My research interests include exploring undiscovered methods of abusing the Domain Name System,
evaluating freedom of expression on the web across the globe, and analyzing the risks involved in using Bitcoin.
I like to use machine learning to support my research.
T. Halvorson, J. Szurdi, G. Maier, M. Felegyhazi, C. Kreibich, N. Weaver, K. Levchenko, and V. Paxson, The BIZ Top-Level Domain: Ten Years Later,
in Proceedings of Passive Active Measurements (PAM 2012), Vienna, Austria, March 12-14, 2012.
I have been focusing recently on several aspects of adversarial behavior: exploitation
of the Domain Name System to steal confidential information; censorship or control of information across the globe by
nation-state actors; and risks in cryptocurrencies. By combining those aspects, I hope to model the most prominent types of attackers,
ranging from nation-states to financially-motivated criminals.
I am now looking at new, so far undiscovered areas of typosquatting and other abuses of the Domain Name System. This new line of research
potentially affects thousands of organizations and millions of users. I am seeking answers to two questions:
how can we measure the impact these attacks have on users, and is somebody already using
these attacks unnoticed?
I participate in a project that aims to automatically infer how different countries censor webpages and why they
censor them. One such case is the Chinese government blocking access to Facebook. My contributions here are in
the selection and development of machine-learning algorithms combining our expert knowledge with well-known clustering algorithms
to identify cases of censorship.
Bitcoin was originally designed to be an entirely decentralized cryptocurrency, independent of any central authority,
but nowadays mining pools, mixers, and exchanges make Bitcoin more centralized, increasing the probability of fraud.
I am analyzing Bitcoin exchanges and the risk they present to their users.
Previous work focused only on typosquatting targeting the most popular domain names, and did not look at temporal evolution.
Our measurement study, published at USENIX Security 2014, showed that less popular domains are actually the primary victims of
typosquatting, and that typosquatting is still a growing phenomenon. Facebook recently won a $2.8 million case against
typosquatters impersonating its website and gained control over about a hundred typosquatting domains (our research
has also shown that over five hundred .com typo domains of facebook.com are registered).
The Long Taile of Typosquatting Domain Names.
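To make the notion of a "typo domain" concrete, here is a minimal sketch that enumerates names one edit away from a target (one insertion, deletion, substitution, or adjacent transposition). This edit-distance-one model is an assumption chosen for illustration and is not necessarily the typo model used in the study; the function name `typo_candidates` is hypothetical.

```python
import string

# Characters allowed in a domain label (letters, digits, hyphen).
ALPHABET = string.ascii_lowercase + string.digits + "-"


def typo_candidates(domain):
    """Return sorted names whose first label is one edit away from `domain`'s."""
    label, _, tld = domain.partition(".")
    candidates = set()
    for i in range(len(label) + 1):
        # Insertion of one character at position i.
        for ch in ALPHABET:
            candidates.add(label[:i] + ch + label[i:])
        if i < len(label):
            # Deletion of the character at position i.
            candidates.add(label[:i] + label[i + 1:])
            # Substitution of the character at position i.
            for ch in ALPHABET:
                candidates.add(label[:i] + ch + label[i + 1:])
        if i < len(label) - 1:
            # Transposition of the adjacent characters at i and i + 1.
            candidates.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])
    # The original name is not a typo of itself.
    candidates.discard(label)
    # Labels must be non-empty and may not start or end with a hyphen.
    valid = {c for c in candidates
             if c and not c.startswith("-") and not c.endswith("-")}
    return sorted(c + "." + tld for c in valid)


if __name__ == "__main__":
    names = typo_candidates("facebook.com")
    print(len(names), names[:5])
```

Checking which of these candidates are actually registered would require zone file or DNS data, which this sketch does not attempt.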
In this collaboration, it was my responsibility to develop a web crawler and a content classifier. The crawler's goal was to download all .biz domains (around 2 million), their
namesakes in the .com zone, and 2 million randomly chosen .com domains. It retrieved the HTML content directly from each domain and also logged errors (DNS, HTTP, etc.). To
identify parking pages, I created two sets of regular expressions: one for finding pages with patterns matching known parking services (such as GoDaddy),
and the other for identifying characteristics that are generally true of parking pages. The crawler also checked whether a .biz domain and its .com namesake were serving exactly the same
page. These data were used to identify domains that were not serving content, parked, defensive, or identical to their .com namesake.
The BIZ Top-Level Domain: Ten Years Later.
Under the guidance of Dr. Mark Felegyhazi, assistant professor at CrySyS Lab, I researched why people register domain names and how cybercriminals abuse the Domain Name System.
I was interested in three dimensions for categorization: 1. Topic (news, adult, IT, etc.), 2. Maliciousness (malicious or benign), 3. Whether the domain is active or passive (parking, defensive, redirected, etc.).
I emphasized proactive detection of malicious domains. Within the scope of my thesis, I explored lexical properties of domain names.
Improvements I intend to make in the future: applying machine learning techniques to the lexical analysis and using zone file information and WHOIS records.
My M.Sc. thesis in English: Understanding the purpose of domain registrations
Under the guidance of Zoltan Faigl, researcher at MIK, I researched future architectures for mobile packet-switched traffic.
Broadband mobile packet-switched traffic load is expected to multiply in the coming decades, which presents a key challenge:
building a scalable architecture for mobile networks. Currently, hierarchical and centralized solutions are supported, but the question is
whether a more distributed architecture would make mobile networks more scalable. I built a discrete linear programming model for both types of architecture (hierarchical and distributed) to find out which is more scalable in the long term.
I also developed software that computes the input data for the discrete linear programming models from a network topology and its parameters.