Janos Szurdi

Third-year Ph.D. student at Carnegie Mellon University

Currently living in: Pittsburgh, Pennsylvania
E-mail: jszurdi at andrew dot cmu dot edu

I'm lucky to be advised by Nicolas Christin

My research interests include: exploring undiscovered methods of abusing the domain name system, evaluating freedom of expression on the web across the globe, and analyzing the risks involved in using Bitcoin. I like to use machine learning to support my research.


Short Bio:
I received my B.Sc. degree in Computer Engineering from the Budapest University of Technology and Economics (BME) in 2010. From 2010 to 2011 I did research as a student under the supervision of Zoltan Faigl at the Mobile Innovation Center (MIK). In 2011-2012 I wrote my thesis under the guidance of Mark Felegyhazi at the Laboratory of Cryptography and System Security (CrySyS Lab), and in 2012 I received my M.Sc. degree in Computer Engineering from BME, graded excellent with highest honors. From 2011 to 2012 I worked at Sonrisa Kft. as a front-end and Java developer. In 2013 I joined the CrySyS Lab as a research assistant for seven months. Since 2013 I have been a Ph.D. student at CyLab, Carnegie Mellon University, under the supervision of Nicolas Christin.

Publications:
J. Szurdi, B. Kocso, G. Cseh, J. Spring, M. Felegyhazi, and C. Kanich,
The Long Taile of Typosquatting Domain Names,
In Proceedings of the USENIX Security Symposium 2014, Aug 2014.

T. Halvorson, J. Szurdi, G. Maier, M. Felegyhazi, C. Kreibich, N. Weaver, K. Levchenko, and V. Paxson,
The BIZ Top-Level Domain: Ten Years Later,
In Proceedings of the Passive and Active Measurement Conference (PAM 2012), Vienna, Austria, March 12-14, 2012.

My M.Sc. thesis in English:
Understanding the purpose of domain registrations


Current research:

Recently I have been focusing on several aspects of adversarial behavior: exploitation of the domain name system to steal confidential information; censorship or control of information across the globe by nation-state actors; and risks in cryptocurrencies. By combining these aspects, I hope to model the most prominent types of attackers, ranging from nation-states to financially motivated criminals.

- Current research on DNS abuses:

I am now looking at new, so far undiscovered, areas of typosquatting and other abuses of the domain name system. This new line of research potentially affects thousands of organizations and millions of users. I am pursuing two questions: how to measure the impact these attacks have on users, and whether somebody is already using these attacks unnoticed.

- Research on censorship:

I participate in a project that aims to automatically infer how different countries censor webpages and why they censor them. One such case is the Chinese government blocking access to Facebook. My contributions are in the selection and development of machine-learning algorithms that combine our expert knowledge with well-known clustering algorithms to identify cases of censorship.

- Research on Bitcoin:

Bitcoin was originally designed to be an entirely decentralized cryptocurrency, independent of any central authority, but nowadays mining pools, mixers and exchanges make Bitcoin more centralized, which increases the probability of fraud. I am analyzing Bitcoin exchanges and the risk they present to their users.

Past research:
+ Earlier research on Typosquatting:

Previous work focused only on typosquatting targeting the most popular domain names, and did not look at temporal evolution. Our measurement study published in USENIX Security 2014 showed that less popular domains are actually the primary victims of typosquatting, and that typosquatting is still a growing phenomenon. Facebook recently won a $2.8 million case against typosquatters impersonating their website and gained control over about a hundred typosquatting domains (our research has also shown that over five hundred .com typo domains of facebook.com are registered). The Long Taile of Typosquatting Domain Names.
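To illustrate what a typosquatting candidate looks like, here is a minimal Python sketch that enumerates names one edit away from a target domain (insertion, deletion, substitution, or transposition of adjacent characters). The one-edit typo model, the alphabet, and the helper name `typo_candidates` are assumptions made for illustration; this is not necessarily the procedure used in the paper.

```python
import string

# Characters allowed in a host name label (assumption for this sketch).
ALPHABET = string.ascii_lowercase + string.digits + "-"


def typo_candidates(domain):
    """Return all names whose first label is one edit away from `domain`.

    Hypothetical helper for illustration; only the first label is mutated
    and the rest of the name (e.g. ".com") is kept unchanged.
    """
    label, dot, rest = domain.lower().partition(".")
    candidates = set()

    # Insertion of one character at every position.
    for i in range(len(label) + 1):
        for ch in ALPHABET:
            candidates.add(label[:i] + ch + label[i:])

    for i in range(len(label)):
        # Deletion of the character at position i.
        candidates.add(label[:i] + label[i + 1:])
        # Substitution of the character at position i.
        for ch in ALPHABET:
            candidates.add(label[:i] + ch + label[i + 1:])

    # Transposition of two adjacent characters.
    for i in range(len(label) - 1):
        candidates.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])

    # Drop the original name, empty labels, and labels that start or end
    # with a hyphen, which are not valid host name labels.
    candidates.discard(label)
    candidates.discard("")
    return sorted(
        c + dot + rest
        for c in candidates
        if not (c.startswith("-") or c.endswith("-"))
    )


if __name__ == "__main__":
    typos = typo_candidates("facebook.com")
    print(len(typos), "candidates, e.g.", typos[:5])
```

Checking which of these candidates are actually registered would require zone file or DNS data, which the sketch does not attempt.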

+ The BIZ Top-Level Domain: Ten Years Later with researchers at the University of California, San Diego and ICSI, Berkeley:

In this collaboration, it was my responsibility to develop a web crawler and a content classifier. The crawler's goal was to download the pages of all biz domains (around 2 million), their namesakes in the com zone, and a random sample of 2 million com domains. It retrieved the HTML content directly from each domain and also logged errors (DNS, HTTP, etc.). To identify parking pages I created two sets of regular expressions: one set for finding pages with patterns matching known parking services (like GoDaddy), and another set for identifying characteristics that are generally true of parking pages. The crawler also checked whether a biz domain and its com namesake were serving exactly the same page. These data were used to identify domains that were not serving content, were parked, were defensive registrations, or were identical to their com namesake. The BIZ Top-Level Domain: Ten Years Later.
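The two-tier regular expression approach described above can be sketched roughly as follows in Python. The patterns, category labels, and function names are hypothetical placeholders chosen for illustration, not the expressions used in the study, and the identical-page check is approximated here by comparing content hashes.

```python
import hashlib
import re

# Set 1: patterns tied to known parking services.
# Hypothetical example; the real list was longer and more specific.
KNOWN_SERVICE_PATTERNS = [
    re.compile(r"godaddy", re.IGNORECASE),
]

# Set 2: generic traits of parking pages (hypothetical examples).
GENERIC_PARKING_PATTERNS = [
    re.compile(r"this domain (name )?is for sale", re.IGNORECASE),
    re.compile(r"domain (is |has been )?parked", re.IGNORECASE),
]


def classify_page(html):
    """Assign a coarse category to the HTML retrieved from a domain."""
    if not html or not html.strip():
        return "no content"
    if any(p.search(html) for p in KNOWN_SERVICE_PATTERNS):
        return "parked (known service)"
    if any(p.search(html) for p in GENERIC_PARKING_PATTERNS):
        return "parked (generic)"
    return "content"


def same_page(biz_html, com_html):
    """Return True if two pages are byte-for-byte identical (via SHA-256)."""
    biz_digest = hashlib.sha256(biz_html.encode("utf-8")).hexdigest()
    com_digest = hashlib.sha256(com_html.encode("utf-8")).hexdigest()
    return biz_digest == com_digest


if __name__ == "__main__":
    print(classify_page("<p>This domain is for sale</p>"))
    print(same_page("<h1>Shop</h1>", "<h1>Shop</h1>"))
```

Checking the service-specific patterns first keeps the more precise rules from being shadowed by the generic ones.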

+ Research at the Laboratory of Cryptography and System Security (CrySyS Lab) in 2011-2012:

Under the guidance of Dr. Mark Felegyhazi, assistant professor at CrySyS Lab, I researched why people register domain names and how cybercriminals abuse the Domain Name System. I was interested in three dimensions for categorization: 1. Topic (news, adult, IT etc.), 2. Maliciousness (malicious or benign), 3. Whether the domain is active or passive (parking, defensive, redirected etc.). I emphasized proactive detection of malicious domains. Within the scope of my thesis I explored lexical properties of domain names. Improvements I intend to make in the future: applying machine learning techniques to the lexical analysis and using zone file information and Whois records. My M.Sc. thesis in English: Understanding the purpose of domain registrations
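As a rough idea of what lexical properties of a domain name can mean, the Python sketch below computes a few simple features from the first label of a name. The feature set (length, digit count, hyphen count, character entropy) and the helper name `lexical_features` are assumptions for illustration and are not taken from the thesis.

```python
import math
from collections import Counter


def lexical_features(domain):
    """Compute simple lexical features of a domain's first label.

    Hypothetical feature set for illustration only.
    """
    label = domain.lower().split(".")[0]
    length = len(label)
    counts = Counter(label)

    # Shannon entropy of the character distribution, in bits.
    entropy = 0.0
    for count in counts.values():
        probability = count / length
        entropy -= probability * math.log2(probability)

    return {
        "length": length,
        "digits": sum(ch.isdigit() for ch in label),
        "hyphens": label.count("-"),
        "entropy": entropy,
    }


if __name__ == "__main__":
    print(lexical_features("example.com"))
    print(lexical_features("x9-k2-q7z.biz"))
```

Features like these could serve as input to a classifier, which matches the machine learning direction mentioned above.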

+ Research at the Mobile Innovation Center (MIK) in 2010-2011:

Under the guidance of Zoltan Faigl, researcher at MIK, I researched future architectures for mobile packet-switched traffic. Broadband mobile packet-switched traffic is expected to multiply over the coming decades, which presents a key challenge: designing a scalable architecture for mobile networks. Hierarchical and centralized solutions are currently supported, but the question is whether a more distributed architecture would make mobile networks more scalable. I built a discrete linear programming model for both types of architecture (hierarchical and distributed) to find out which is more scalable in the long term. I also developed software that computes the input data for the discrete linear programming models from a network topology and its parameters.

Classes:
+ Classes taken at Carnegie Mellon University:
  • Introduction to Computer Security 18-730: A
  • Secure Software Systems 18-732: A
  • Machine learning 10-601: A
  • Machine learning 10-701: A-
+ Classes taken at Budapest University of Technology and Economics (BME):
  • Information Security: Good
  • Security Protocols: Good (Excellent at final exam)
  • Cryptography and Its Applications: Excellent
  • Foundation of Secure Electronic Commerce: Excellent (Excellent at final exam)
  • Secure Communication System Laboratory Exercise I: Good
  • Secure Communication System Laboratory Exercise II: Excellent
  • Practical Network Security: Excellent
  • Networking Architectures: Excellent
  • Mobile infocommunication networks: Good
  • Navigation Services and Applications: Excellent

Teaching experience:

  • 2014 spring: Teaching Assistant for Network Security - 18-731, lectured by Prof. Nicolas Christin at Carnegie Mellon University
  • 2014 fall: Teaching Assistant for Introduction to Information Security - 18-631, lectured by Prof. Nicolas Christin at Carnegie Mellon University


Fellowships and grants:

  • 2014 - 23rd USENIX Security Symposium Student Grant
  • 2014 - 4th Bar-Ilan Winter School on Cryptography Student Stipend
  • 2013 - Ann and Martin McGuinn Graduate Fellowship
  • 2013 - Carnegie Institute of Technology Dean's Tuition Fellowship


Talks:

Lightning talk presentation created by Mark Felegyhazi for USENIX Security 2014:

Slides: The "Long Taile" of Typosquatting


When I have a little spare time I enjoy:

Astronomy

  • I use a 150/750 Newtonian telescope to study astronomy and to explore our Universe and its phenomena.
  • I am also interested in the astrophysical background of these phenomena.

Martial Arts

  • I have practiced many forms of martial arts since I was eight years old: Judo, Ninjutsu, Southern Mantis Kung Fu, Kempo, Shinkendo etc.

Skateboarding

  • I have been a devoted skateboarder for 16 years.