CS294-28 Internet/Network Security Projects
General information
Your term project should address a research issue in
network security, interpreted broadly (it need not be a topic discussed
in class). In terms of depth and quality, the goal is to develop the
effort to a degree that would merit at least a workshop-caliber
publication.
Most projects will fall into one of the following general
categories:
- Analyze. Undertake a substantive analysis/assessment of
security issues for a given network system. For example, to what degree
does Skype expose its users to remote compromise? Preserve their privacy?
Admit misuse of the system to aid in denial-of-service attacks? What
is its trust model? What steps could be taken to strengthen Skype in
this regard? What can you say about the expected efficacy of those steps?
(Note:
it needn't be an application nor involve end systems; you can consider
schemes relevant to other layers of the networking stack, or that
concern infrastructure/internal components.)
- Measure. Empirically explore and characterize a network
security issue. For example, under what circumstances and to what degree
do nodes in the Tor anonymizing network alter the content that passes
through them?
- Innovate. Devise and analyze (or possibly implement)
a new mechanism or technique. For example, this could be a new way to
protect servers from application-level denial-of-service attacks, or
a new detector for some type of malicious activity.
- Test. Take a result in the literature and undertake
a thoughtful and meaningful reproduction of it to assess to what degree
you obtain the same results, and why.
- Attack. Develop a new threat. Assess its efficacy,
countermeasures/defenses, and likely "arms race" evolution.
- Research. Conduct a deep, thoughtful literature survey
of a particular area in network security ("research" as a verb). Assess
the strengths and weaknesses of the published results in the area, delimit
the boundaries of the state of the art, identify themes and abstractions,
frame avenues for future work.
I encourage you to find a topic of interest to you; feel free to be creative
in selecting a project topic. You're welcome to pick a topic that is
connected to your current research, and I'm happy to discuss possible
topics with you in advance. See below for a list of some possible ideas
(just meant as grist). Often you can pursue the same project jointly
for two different classes. If this would be the case, you need to discuss
it first both with me and with the other instructor(s).
Preferably you should work in a team of two, though individual projects
are okay too. Team projects will be held to a somewhat higher standard.
If you want to work in a team larger than two, first talk with me about
why this is appropriate and how the work will be divided.
The process
(Note that the following dates are at this point tentative.)
- Write a concise (approximately 1 page)
project proposal that clearly states the problem you will be tackling,
the key challenges for new research, and your plan of attack (including
milestones and dates). If there are any special resources you might need,
flag these. Mention any relevant papers of which you are already aware.
The project proposal is due the evening of Friday Sep 17.
- As part of turning in the project proposal, schedule a
meeting with me to discuss your idea.
- Put together a related work writeup. This writeup should
reflect a solid grounding in the literature relevant for your project,
written in a style similar to the related work sections in
the papers we've been reading. For
each item of previous related work, briefly discuss the contributions
of the paper, its relevance to your undertaking, and (if appropriate)
in what ways it differs from your effort.
In general, you can tell whether your related work framing is possibly
too narrow by looking at the citations of those papers you currently
discuss. If you see that they cite tons more work that, at least from the
titles, sounds like it could be germane, then it's your task as a researcher
to track those papers down - ideally, all of the ones that sound like
they could be relevant - and assess which ones you indeed need to read and
absorb. Note, read-and-absorb here can run the range from reading in
detail, similar to how you read papers for the class, to just reading
sections or such, as you gauge relevance.
You then recurse on the citations in those papers, repeating
the process until you converge by not finding any new papers, and/or the
ones you find become only lightly related.
At this point, you've then mastered the full literature on the
area you're working in (and usually gotten a bunch of new ideas
about what to try or, often more important, not try).
When gathering these related papers, you may run across some that require
payment through portals such as those run by ACM or IEEE. Note that UCB
has site licenses for most of these libraries, so you should be able
to readily fetch them using a campus machine/address without needing to provide
payment.
The related work writeup is due the evening of Friday Oct 15.
- Write up a short status report explaining what work you have
completed, what remains, and any open issues (such as problems you haven't
figured out how to solve or additional resources you require). Begin
your report with a sketch of your project so I'm reminded of the context
while reading it.
The status report writeup is due the evening of Monday Nov 8.
- As part of turning in the status report writeup, also schedule a
meeting with me to discuss your report.
- Prepare a class presentation. These will be on
Wed Dec 1, Fri Dec 3, and Mon Dec 6
(note special time). 24 hours prior to the class
in which you'll be presenting, mail out a brief (~1-2 paragraphs)
description of your project to the class mailing list.
There's an art to scoping a presentation to effectively make use of
the available time. You need to gauge what context your particular audience
(here, this means your classmates) already has regarding the problem space
your work addresses, and not spend time developing that broader context;
at most, just remind them. However, it will (better!) be the case that
your particular area has depth beyond what the average audience member
knows about. You do need to frame this additional context, both
in terms of what makes the problem interesting and significant, and how
the problem space has been previously viewed in terms of prior work and
the assumptions this work reflects.
Note: depending on class size, it's possible that instead of
presentations we will have a poster session. I will decide
between the two well in advance of the presentation dates.
- Finally, your project report is due on Monday Dec 13, at 1PM.
No extensions will be granted.
The final report
You are expected to write a technical paper, in the style
of a conference submission, on the research you have done.
State the problem you're addressing, motivate why it is an
important or interesting problem, present your research
thoroughly and clearly, compare to any related work that
may exist, summarize your research contributions,
and draw whatever conclusions may be appropriate.
There is no page limit (either minimum or maximum),
but reports will be evaluated on
technical content and not on length.
Here are some pointers regarding writing technical papers:
- Use active voice (verbs that convey action) rather than
passive voice ("is" verbs). For example, "We need to consider the problem
of spoofing" rather than "The problem of spoofing is something to consider".
Passive voice, especially lengthy sequences (occasional uses are fine),
reads stilted, and makes it harder for the reader to keep their train of
thought focused.
- I am a fan of having the Related Work section at the beginning of
a paper rather than the end, so the reader has the full context available
to them from the get-go. As a reviewer, I sometimes gain a perception
(perhaps unfairly) that papers deferring Related Work until the end
are trying to convey a stronger claim to novelty than the paper in
fact merits.
- Defer details of lesser importance (those that the reader doesn't
really need to follow the technical development, or those that require
in-depth discussion that distracts from the main point at hand) to one
or more Appendices.
- Avoid sans serif fonts (those without any tails or hooks
on the letters) such as Helvetica. These are significantly harder to read
than serif fonts such as Times Roman.
- Be sure to number your pages. That allows the reviewer to refer
to particular text in their comments.
- Make your figures large enough so that they clearly show the points
you want them to illustrate. In an actual conference submission, you
often can get into trouble with page limitations, and then have to squeeze
down figures to stay within the length requirements. But that's not
an issue for your class project writeup, for which you should
include all of the relevant analysis. For your writeup, if something feels
like a detail, you can defer it to an appendix. The same goes for figures -
generate all that have relevance. (Obviously, use some judgment here.
If you could generate 100 figures for different choices of parameters,
it's not useful to include all of them. In fact, it's important to instead
determine how to convey the information more concisely.)
- Likewise, for a writeup like this where you don't have space
restrictions, use an 11-point font or larger, so the text
is physically easy to read.
- If possible, your figures should work if printed in black-and-white,
since some reviewers do their reviewing using hardcopy. (I'm one of them,
but as noted below if your figures need color, let me know and I'll print
them accordingly.)
- Spell-check and proofread to correct grammar errors. If your text has
too many of these, it can cost you in terms of assumptions the reviewers
may make regarding your general attention to detail (and thus whether they
trust that you conducted your analyses carefully). If your English
is not fluent, it's worth getting help with it. Unfortunately, for
"blind reviewing" (conferences that require anonymized submissions),
deficiencies in English can prejudice the reviewers towards expecting
inferior quality in the work. Conversely, engaging writing can really
help in subtly suggesting to the reviewer that you know what you're doing
and they should trust you on unclear fine points.
- Some nits to proofread for:
- Using the same word twice in the same sentence, or (sometimes)
in nearby sentences.
- Typos of repeated words, such as "and and".
- Regularize your bibliography. Avoid redundant dates like
"Proceedings of the 2003 Foobar Conference, March 2003".
Beware of how BibTeX can mangle your capitalization.
- If using LaTeX, make sure you use open-quotes (``) and
close-quotes ('') rather than double-quotes ("), which
it renders in only one direction.
These might seem trivial, but they can convey to the reader a sense of
whether you took the time to carefully read over your draft.
- Do not reuse text or figures without citation. Doing so is plagiarism,
and can kill your credibility with the reviewer if they detect it.
- If you are not familiar with writing conference-style papers
in computer science, the following resources
(from David Wagner's CS 261 course)
may help:
Please submit either HTML or PDF, via email attachment.
I generally review papers from hardcopy, so it needs to print clearly and
with sufficiently large text and figures. If you use color figures, mention
that in your cover note so I can send it to an appropriate printer.
Some possible ideas
Here are a number of project ideas, some fairly specific and others more
general. They are meant to stimulate your thinking and you don't have
to select one. Some of them have particular considerations noted in
italics.
- To what degree can you assess the accuracy of blacklist feeds (bad IP
addresses, URLs, domains, or such)? How effective/evadable are
they? To what degree are different feeds redundant?
- Spammers have been found to sometimes hijack BGP address blocks in order
to briefly send from someone else's address space. How prevalent
is this activity today?
- Work out an architecture for providing the Internet (or a future
version of it) with solid attribution properties, while also
preserving privacy when not in conflict with legal requirements.
- How well can you detect web-based attacks using network monitoring?
Build detectors for attacks such as XSS, CSRF, or SSL stripping
and implement them for the
real-time Bro system developed by my research group. Assess
detector efficacy in terms of false positives and false negatives.
As the detectors mature, I can provide results of running
them against large, live traffic streams.
- How serious is the problem of blog spam? What might be done to
detect it?
- Is robots.txt actually honored? It seems it can't be, as it would
otherwise provide a very easy way for malicious web sites to
avoid inspection from folks like Google or Bing. If it isn't
honored, can you still identify crawlers?
- To what extent can you fingerprint individual users by the timing
of their typing/packets during interactive network sessions (such
as logging in to a remote site via SSH)? This would be a
continuation of a project begun by a previous student. I can
provide an extensive dataset of hundreds of users typing over
Telnet sessions, for which ground truth is available.
- Analyze the UCSD/CAIDA "backscatter" data to characterize how often
a DDoS attack results in ISPs removing connectivity, based on
a change from observing RSTs/SYN-ACKs to ICMP Unreachables.
How long does it take ISPs to "pull the plug"? Along with
the backscatter data, UCSD also has a trace of a DDoS attack as
seen by the target, which could be analyzed instead or in
addition.
- Flushing out illicit snooping: if you mention a URL in a supposedly
private context (such as an anonymous Tor circuit, or in email
sent via Gmail, or an IRC chat), does one of the parties facilitating
the communication (e.g., Tor exit node; Google; the IRC server
operator) ever investigate the mention? This project
has some risk of producing only negative results. However,
a positive result would make a big splash. Thus, it would
behoove one to make some up-front measurements to assess
viability. Also, this project might be pursued working in
collaboration with Prof. Stefan Savage of UCSD.
- When monitoring a site's access link, usually you expect to only
see outgoing DNS requests from the site's internal name servers,
which the site's hosts are supposed to use. If you see a lookup
coming directly from an internal host, it may reflect malware that
has reconfigured a system to use an external resolver ... or it
might just reflect a misconfiguration. How can we determine if
it reflects a problem? This would be a
continuation of a project my group is pursuing, where the
notion is to leverage a large list of open resolvers to
determine whether results returned for such lookups likely reflect
localization, or malice.
- To what degree can DNS registry information (e.g., "whois" records)
be used to infer how dangerous a given address/domain is likely
to be? This project would be in collaboration with a postdoc
working in my research group, and a continuation of an existing
effort.
- Study the phenomenon of email spam that attempts to recruit "mules"
for laundering money and/or fraudulently purchased products.
What can you determine about the different recruiters and
the patterns of interest evinced in the recruiting messages?
This would be working with a CS294-28 student, Albert Kim,
who has already pursued groundwork for this project with
my research group. One angle you
could pursue here would be to use natural language technology
to build a system that can construct numerous replies to spam
emails such that they appear to be from different individuals.
- How has network scanning evolved over the past 15 years? What
about use of services, and to what degree can service "flux"
be used to spot malicious services (such as newly installed
backdoors)? For this project I can provide mediated access
to a very large longitudinal dataset of connections seen
at the Lawrence Berkeley National Laboratory.
- Design a JavaScript rewriter - an in-path network element or a
browser prefilter that modifies JavaScript transferred in Web items.
Evaluate the degree of protection it can provide to browsers
versus semantic distortions it introduces. Members of
my research group are interested in collaborating on this
project.
- Securing "mediated" trace analysis: a major problem in network
security research is obtaining access to realistic traffic traces.
One paradigm for enabling such access is via "mediation", i.e.,
the researcher sends their analysis program to a data-holder, who
runs it on behalf of the researcher and returns the results.
How can we secure this process so that the data-holder can be
confident that the results do not leak sensitive information?
This would be a continuation of a research project my
group has pursued, which grew out of a previous class project.
There's a HotNets 2009 paper outlining how far we got.
- Construct a web "backtrace": when a traffic trace shows a user
arriving at a malicious URL, determine how they got there. Some of
this is straightforward (recording of Referer chains); where it
may get more difficult is stealthier redirection mechanisms.
- Explore the "traffic delivery business" where you can purchase
"eyeballs" to visit your Web site. Do the sellers of this
service actually deliver on increased visits? Where do the
visitors come from? Are they humans or bots? How did the
seller spur them to visit? This project has already been
partially undertaken by a student at UCSD.
- Build a detector for traffic injection (e.g., DNS or ARP spoofing)
and run it as widely as you can. What do you find? This project
contains some risk, namely you may wind up with a completely
negative result - no evidence of injection.
- A significant problem in network security monitoring is grappling with
the large number of application protocols. Simply understanding
their workings is currently a lengthy manual process. Employ
forms of network protocol inference or binary execution analysis
to automate elements of extracting the workings of unknown
protocols. If done using network monitoring, this project
would be in conjunction with members of my research group. If
done using binary execution analysis, then with Prof. Dawn Song's group.
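
As one illustration of the "straightforward" part of the web backtrace idea above (following Referer chains), here is a minimal Python sketch. The record format, the function name backtrace, and the example URLs are all assumptions made for illustration; they do not come from any existing tool or dataset. The sketch assumes the requests belong to a single client and are already sorted by time, and it makes no attempt to handle the stealthier redirection mechanisms that the idea flags as the hard part.

```python
from typing import List, Optional, Tuple

# Hypothetical record format: (timestamp, requested URL, Referer header or None).
# Assumed to cover a single client and to be sorted by timestamp.
Request = Tuple[float, str, Optional[str]]


def backtrace(requests: List[Request], target_url: str) -> List[str]:
    """Return the chain of URLs leading to target_url, oldest first.

    Returns an empty list if target_url never appears in the trace.
    """
    # Locate the first request for the target URL.
    idx = next(
        (i for i, (_, url, _) in enumerate(requests) if url == target_url),
        None,
    )
    if idx is None:
        return []

    chain = [target_url]
    seen = {target_url}  # guards against Referer loops
    referer = requests[idx][2]

    while referer is not None and referer not in seen:
        chain.append(referer)
        seen.add(referer)
        # Find the most recent earlier request for the referring page.
        prev = None
        for i in range(idx - 1, -1, -1):
            if requests[i][1] == referer:
                prev = i
                break
        if prev is None:
            break  # the referring page itself was not captured in the trace
        idx = prev
        referer = requests[idx][2]

    chain.reverse()
    return chain


# Made-up example trace (example domains only).
EXAMPLE_TRACE: List[Request] = [
    (1.0, "http://search.example/q", None),
    (2.0, "http://blog.example/post", "http://search.example/q"),
    (3.0, "http://ads.example/click", "http://blog.example/post"),
    (4.0, "http://evil.example/land", "http://ads.example/click"),
]
```

Calling backtrace(EXAMPLE_TRACE, "http://evil.example/land") walks the Referer headers backwards from the landing page to the request that had no Referer. A real project would also need to cope with missing or stripped Referer headers, multiple tabs, and redirects that never set a Referer at all.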