Lecture 9: Web security

Overview

Requests:

request is composed of header and optional body separated by CRLF
header contains method, resource, protocol version, other info
body is considered as byte stream

Resource can be specified by absolute URI or absolute path. In HTTP/1.1, Host field is required to specify server to receive request. HTTP/2 lets server to push content (respond with data for more queries than client requested).

Replies:

composed of header and body separated by CRLF
header contains protocol version, status code, diagnostic text, other info
body is a byte stream

Header fields:

general: refer to message (date, pragma, cache-control, transfer-encoding, etc.)
request: accept (MIME type), host, authorization, from, if-modified-since, user-agent, referer, etc.
response: location, server, www-authenticate
entity: allow (methods that can be invoked), content-encoding, content-length (required if body not null), content-type (MIME type of body), expires, last-modified

URIs

syntax: {scheme}://{authority}{path}?{query}
can be absolute or relative

Authentication

simple challenge-response theme
challenge returned by server as part of 401 reply, specifies auth schema to be used
auth request refers to realm (set of resources on server)
client must include Authorization header field with required valid credentials
examples:
- Basic HTTP auth:
  - server replies to unauthorized request wit 401 message containing header field: WWW-Authenticate: Basic realm="ReservedDocs"
  - client retries access including in header a field containing cookie composed of base64 encoded username and password: Authorization: Basic <token>
- HTTP1.1 auth:
  - defines additional auth scheme based on cryptographic digests
    - server sends nonce as challenge
    - client sends request with digest of username, password, given nonce value, HTTP method, and request URL
  - to authenticate users, web server must have access to plaintext user passwords
- WebAuthn

Maintaining State

HTTP is stateless, but many apps require maintaining state across requests
ways:
- embedding information in URLs: app embeds user-specific info in every link contained in page returned to user
- putting information in forms: use hidden input tags, contains names/values
- cookies: set by server by including Set-Cookie header, cookies are passed in every further transaction with the site, accessible only by site that set them
- sessions: implemented at app level, at the beginning of session a unique ID is generated for the user, and is used to index info stored on server side

Server side

Common Gateway Interface:

mechanism to invoke programs on server side
program’s output returned to client
input parameters passed using URL in GET, or using body in POST
programs can be written in any language
input to program passed to process’ stdin
parameters passed by setting environment variables (REQUEST_METHOD, PATH_INFO, QUERY_STRING, CONTENT_TYPE, etc.)

Active Server Pages (ASP, ASP.NET)

Microsoft’s version of CGI scripts
pages containing mix of text, HTML tags, scripting directives (VBScript, JScript), server-side includes
directives executed on server side before serving page

Servlets, JavaServer pages

Servlets: Java programs executed on server, similar to CGI programs, can be executed within existing JVM without making new process
JavaServer pages (JSP): static HTML intermixed with Java code, similar to ASP, compiled into servlets

PHP:

scripting language embedded in HTML pages
PHP code executed on server side when page containing code request
common setup is LAMP (Linux+Apache+MySQL+PHP)

Web Application Frameworks

provide support for fast development of web apps
might be based on existing web servers, or have a new environment
often based on Model-View-Controller architecture
provide automatic translation of objects from/to database
provide templates for generating dynamic pages

Client side

Java applets:

compiled Java programs downloaded into browser and executed in context of web page
access to resources regulated by implementation of Java Security Manager
dead

ActiveX controls

binary, native code downloaded and executed in context of page
only supported by Windows-based browsers
code signed using Authenticode mechanism
once executed, complete access to client’s environment
dead

JavaScript/JScript, EcmaScript/VBScript

scripting languages for dynamic behavior in web pages
JS initially introduced by NetScape, JScript is Microsoft’s version
EcmaScript standardised version of JS
VBScript is based on Microsoft Visual Basic

asm.js:

subset of JS allowing for fast code
can use special compiler passes to e.g. translate C to asm.js

webassembly

low-level bytecode for in-browser client-side scripting initial aim to support compilation from C and C++
initial implementation support in browsers based on feature set of asm.js

Code is embedded into HTML pages using script tag. Window is top of hierarchy of objects. DOM (Document object model) lets your script manipulate content. BOM (Browser object model) is interface to browser’s properties.

JS security policies:

same origin: JS code can only access resources (e.g. cookies) associated with same origin/host
- every frame in browser’s window is associated with domain (origin = URI scheme, hostname, port number)
- web browser only lets scripts contained in web page A to access data in web page B if they have the same origin
- even iframes/included scripts execute within frame domain
signed script: signature on JS code is verified and principal identity extracted; principal identity compared to policy file to determine level of access
configurable: user can manually modify policy file to allow/deny access to specific :/methods for code from specific sites

Site isolation

site-dedicated processes, with browser process as interface
cross-origin read blocking: stop access to specific types of data

AJAX (asynchronous JavaScript and XML)

way to modify page based on result of request, without need of explicit user action
relies on JS-based DOM manipulation, and XML-HTTP request object
using onreadystatechange property of XML-HTTP request, set callback for result of query

Possible attacks:

bug in renderer process
universal cross-site scripting bugs let attacker bypass same origin policy
side channel attacks like Spectre/RIDL let attacker read arbitrary renderer process memory

Computer and Network Security

Table of Contents

Lecture 9: Web security

Overview

Server side

Client side