HTTP/1.1 Protocol Specification (Chinese Summary Edition)

I. Introduction

1. Purpose
   HTTP/0.9, HTTP/1.0, HTTP/1.1
2. Requirements
   MUST, REQUIRED, SHOULD
3. Terminology
   connection, message, request, response, resource, entity,
   representation, content negotiation, variant, client, user agent,
   server, origin server, proxy, gateway, cache, cacheable, first-hand,
   explicit expiration time, heuristic expiration time, age,
   freshness lifetime, fresh, stale, semantically transparent,
   validator (an entity tag or a Last-Modified time),
   upstream/downstream, inbound/outbound
4. Overall operation
   Request/response; intermediaries

II. Notational Conventions and Generic Grammar

1. Augmented BNF
   name = definition, "literal", rule1 | rule2, (rule1 rule2), *rule,
   [rule], N rule, #rule, ; comment, implied *LWS
2. Basic rules
   OCTET, CHAR, UPALPHA, LOALPHA, ALPHA, DIGIT, CTL, CR, LF, SP, HT

III. Protocol Parameters

1. HTTP version
   HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
2. Uniform Resource Identifiers (URI)
   URIs are the combination of Uniform Resource Locators (URL) and
   Uniform Resource Names (URN).
   http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
3. Date/time formats
   Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123
   Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
   Sun Nov  6 08:49:37 1994       ; ANSI C's asctime() format
4. Character sets
   In this document, the term "character set" refers to a method that
   uses one or more tables to convert a sequence of octets into a
   sequence of characters.
   charset = token
   Missing charset
5. Content codings
   Content codings are used mainly to allow a document to be compressed
   (source coding).
   content-coding = token
   The registry contains the following tokens: gzip, compress, deflate,
   identity
6. Transfer codings
   Transfer codings exist to ensure safe transport through the network
   (channel coding).
   transfer-coding = "chunked" | transfer-extension
   transfer-extension = token *( ";" parameter )
   Chunked transfer coding
7. Media types
   media-type = type "/" subtype *( ";" parameter )
   type = token
   subtype = token
   Canonicalization and text defaults
   Multipart types
8. Product tokens
   product = token ["/" product-version]
   product-version = token
9. Quality values
   qvalue = ( "0" [ "." 0*3DIGIT ] )
          | ( "1" [ "." 0*3("0") ] )
10. Language tags
   language-tag = primary-tag *( "-" subtag )
   primary-tag = 1*8ALPHA
   subtag = 1*8ALPHA
11. Entity tags
   entity-tag = [ weak ] opaque-tag
   weak = "W/"
   opaque-tag = quoted-string
12. Range units
   range-unit = bytes-unit | other-range-unit
   bytes-unit = "bytes"
   other-range-unit = token

IV. HTTP Message

1. Message types
   HTTP-message = Request |
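                  Response ; HTTP/1.1 messages
   generic-message = start-line
                     *(message-header CRLF)
                     CRLF
                     [ message-body ]
   start-line = Request-Line | Status-Line
2. Message headers
   HTTP header fields comprise general-header, request-header,
   response-header, and entity-header fields.
   message-header = field-name ":" [ field-value ]
   field-name = token
   field-value = *( field-content | LWS )
   field-content = <the OCTETs making up the field-value
                   and consisting of either *TEXT or combinations
                   of token, separators, and quoted-string>
3. Message body
   message-body = entity-body
                | <entity-body encoded as per Transfer-Encoding>
4. Factors that determine message length
5. General header fields
   general-header = Cache-Control | Connection | Date | Pragma
                  | Trailer | Transfer-Encoding | Upgrade | Via
                  | Warning

V. Request

The first line of a request states the method to be applied to the
resource, the identifier of the resource, and the protocol version.
   Request = Request-Line
             *(( general-header
              | request-header
              | entity-header ) CRLF)
             CRLF
             [ message-body ]

1. Request-Line
   Request-Line = Method SP Request-URI SP HTTP-Version CRLF
   Method
   The Method token indicates the method to be performed on the
   resource identified by the Request-URI.
   Method = "OPTIONS" | "GET" | "HEAD" | "POST" | "PUT" | "DELETE"
          | "TRACE" | "CONNECT" | extension-method
   extension-method = token
   Request-URI
   The Request-URI is a Uniform Resource Identifier that identifies the
   resource to which the request applies.
   Request-URI = "*" | absoluteURI | abs_path | authority
   Request-Line examples:
   GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1

   GET /pub/WWW/TheProject.html HTTP/1.1
   Host: www.w3.org
2. The resource identified by a request
   The exact resource identified by an Internet request is determined
   by both the Request-URI and the Host header field.
3. Request header fields
   request-header = Accept | Accept-Charset | Accept-Encoding
                  | Accept-Language | Authorization | Expect | From
                  | Host | If-Match | If-Modified-Since
                  | If-None-Match | If-Range | If-Unmodified-Since
                  | Max-Forwards | Proxy-Authorization | Range
                  | Referer | TE | User-Agent

VI. Response

After receiving and interpreting a request message, a server responds
with an HTTP response message.
   Response = Status-Line
              *(( general-header
               | response-header
               | entity-header ) CRLF)
              CRLF
              [ message-body ]

1. Status-Line
   Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
   Status code
   The Status-Code is a 3-digit integer result code of the attempt to
   understand and satisfy the request: 1xx, 2xx, 3xx, 4xx, 5xx
   (codes 100 through 505), plus extension codes.
2. Response header fields
   response-header = Accept-Ranges | Age | ETag | Location
                   | Proxy-Authenticate | Retry-After | Server | Vary
                   | WWW-Authenticate

VII. Entity

Unless otherwise restricted, request and response messages may transfer
an entity. An entity consists of entity-header fields and an
entity-body, although some responses include only the entity-headers.

1. Entity header fields
   entity-header = Allow | Content-Encoding | Content-Language
                 | Content-Length | Content-Location | Content-MD5
                 | Content-Range | Content-Type | Expires
                 | Last-Modified | extension-header
   extension-header = message-header
2. Entity body
   entity-body = *OCTET
   entity-body := Content-Encoding( Content-Type( data ) )

VIII. Connections

1. Persistent connections
   Advantages
   Persistent connections are the default behaviour of any HTTP
   connection. A client that supports persistent connections may
   pipeline its requests.
   Proxy servers
2. Message transmission requirements
   Persistent connections and flow control
   Monitoring connections for error status messages
   Use of the 100 (Continue) status
   Client behaviour if the server prematurely closes the connection

IX. Method Definitions

1. Safe and idempotent methods
   Safe methods
   The GET and HEAD methods should not have the significance of taking
   an action other than retrieval.
   Idempotent methods
   A sequence of requests that never has side effects is idempotent.
2. OPTIONS
   The OPTIONS method represents a request for information about the
   communication options available on the request/response chain
   identified by the Request-URI.
3. GET
   The GET method retrieves whatever information is identified by the
   Request-URI.
4. HEAD
   The HEAD method is identical to GET except that the response must
   not contain a message-body.
5. POST
   The actual function performed by the POST method is determined by
   the server.
6. PUT
   The PUT method requests that the enclosed entity be stored under the
   supplied Request-URI.
7. DELETE
   The DELETE method requests that the origin server delete the
   resource identified by the Request-URI.
8. TRACE
   The TRACE method is used to invoke a remote, application-layer
   loop-back of the request message.
9. CONNECT
   The CONNECT method is used with a proxy that can dynamically switch
   to being a tunnel.

X. Status Code Definitions

1. Informational 1xx
   100 Continue
   101 Switching Protocols
2. Successful 2xx
   200 OK
   201 Created
   202 Accepted
   203 Non-Authoritative Information
   204 No Content
   205 Reset Content
   206 Partial Content
3. Redirection 3xx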
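   300 Multiple Choices
   301 Moved Permanently
   302 Found
   303 See Other
   304 Not Modified
   305 Use Proxy
   306 (Unused)
   307 Temporary Redirect
4. Client Error 4xx
   400 Bad Request
   401 Unauthorized
   402 Payment Required
   403 Forbidden
   404 Not Found
   405 Method Not Allowed
   406 Not Acceptable
   407 Proxy Authentication Required
   408 Request Timeout
   409 Conflict
   410 Gone
   411 Length Required
   412 Precondition Failed
   413 Request Entity Too Large
   414 Request-URI Too Long
   415 Unsupported Media Type
   416 Requested Range Not Satisfiable
   417 Expectation Failed
5. Server Error 5xx
   500 Internal Server Error
   501 Not Implemented
   502 Bad Gateway
   503 Service Unavailable
   504 Gateway Timeout
   505 HTTP Version Not Supported

XI. Access Authentication

Optional.

XII. Content Negotiation

HTTP provides several mechanisms for content negotiation, that is, for
selecting the best representation for a given response when several
representations are available.

1. Server-driven negotiation
   The best representation for a response is selected by an algorithm
   located at the server.
2. Agent-driven negotiation
   The best representation for a response is selected by the user agent
   after it receives an initial response from the origin server.
3. Transparent negotiation
   Transparent negotiation is a combination of server-driven and
   agent-driven negotiation.

XIII. Caching in HTTP

HTTP is typically used for distributed information systems whose
performance can be improved by caching.

1. Caching
   Cache correctness
   Warnings
   Cache-control mechanisms
   Explicit user agent warnings
   Exceptions to the rules and warnings
   Client-controlled behaviour
2. Expiration model
   Server-specified expiration
   Heuristic expiration
   Age calculations
   Expiration calculations
   Disambiguating expiration values
   Disambiguating multiple responses
3. Validation model
   When a cache has a stale entry that it would like to use as a
   response to a client's request, it must first check with the origin
   server whether the cached entry is still usable.
   Last-Modified dates
   Entity tag cache validators
   Weak and strong validators
   Rules for when to use entity tags and Last-Modified dates
   Non-validating conditionals
4. Response cacheability
   Unless specifically constrained, a caching system may always store a
   successful response as a cache entry.
5. Constructing responses from caches
   End-to-end and hop-by-hop headers
   Non-modifiable headers
   Combining headers
   Combining byte ranges
6. Caching negotiated responses
7. Shared and non-shared caches
8. Errors or incomplete response cache behaviour
9. Side effects of GET and HEAD
10. Invalidation after updates or deletions
11. Write-through mandatory
12. Cache replacement
13. History lists

XIV. Header Field Definitions

1. Accept
   Accept = "Accept" ":" #( media-range [ accept-params ] )
   media-range = ( "*/*"
                 | ( type "/" "*" )
                 | ( type "/" subtype )
                 ) *( ";" parameter )
   accept-params = ";" "q" "=" qvalue *( accept-extension )
   accept-extension = ";" token [ "=" ( token | quoted-string ) ]
   Example 1: Accept: audio/*; q=0.2, audio/basic
   Example 2: Accept: text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c
2. Accept-Charset
   Accept-Charset = "Accept-Charset" ":"
                    1#( ( charset | "*" ) [ ";" "q" "=" qvalue ] )
   Example: Accept-Charset: iso-8859-5, unicode-1-1;q=0.8
3. Accept-Encoding
   Accept-Encoding = "Accept-Encoding" ":"
                     1#( codings [ ";" "q" "=" qvalue ] )
   codings = ( content-coding | "*" )
   Example: Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0
4. Accept-Language
   Accept-Language = "Accept-Language" ":"
                     1#( language-range [ ";" "q" "=" qvalue ] )
   language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" )
   Example: Accept-Language: da, en-gb;q=0.8, en;q=0.7
5. Accept-Ranges
   Accept-Ranges = "Accept-Ranges" ":" acceptable-ranges
   acceptable-ranges = 1#range-unit | "none"
   Example: Accept-Ranges: bytes
6. Age
   Age = "Age" ":" age-value
   age-value = delta-seconds
7. Allow
   Allow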
     = "Allow" ":" #Method
   Example: Allow: GET, HEAD, PUT
8. Authorization
   Authorization = "Authorization" ":" credentials
9. Cache-Control
   Cache-Control = "Cache-Control" ":" 1#cache-directive
   cache-directive = cache-request-directive
                   | cache-response-directive
   cache-request-directive = "no-cache"
                           | "no-store"
                           | "max-age" "=" delta-seconds
                           | "max-stale" [ "=" delta-seconds ]
                           | "min-fresh" "=" delta-seconds
                           | "no-transform"
                           | "only-if-cached"
                           | cache-extension
   cache-response-directive = "public"
                            | "private" [ "=" <"> 1#field-name <"> ]
                            | "no-cache" [ "=" <"> 1#field-name <"> ]
                            | "no-store"
                            | "no-transform"
                            | "must-revalidate"
                            | "proxy-revalidate"
                            | "max-age" "=" delta-seconds
                            | "s-maxage" "=" delta-seconds
                            | cache-extension
   cache-extension = token [ "=" ( token | quoted-string ) ]
   What is cacheable
   What may be stored by caches
   Modifications of the basic expiration mechanism
   Cache revalidation and reload controls
   No-transform directive
   Cache control extensions
10. Connection
   Connection = "Connection" ":" 1#(connection-token)
   connection-token = token
   Example: Connection: close
11. Content-Encoding
   Content-Encoding = "Content-Encoding" ":" 1#content-coding
   Example: Content-Encoding: gzip
12. Content-Language
   Content-Language = "Content-Language" ":" 1#language-tag
   Example: Content-Language: mi, en
13. Content-Length
   Content-Length = "Content-Length" ":" 1*DIGIT
   Example: Content-Length: 3495
14. Content-Location
   Content-Location = "Content-Location" ":"
                      ( absoluteURI | relativeURI )
15. Content-MD5
   Content-MD5 = "Content-MD5" ":" md5-digest
   md5-digest = <base64 of 128 bit MD5 digest as per RFC 1864>
16. Content-Range
   Content-Range = "Content-Range" ":" content-range-spec
   content-range-spec = byte-content-range-spec
   byte-content-range-spec = bytes-unit SP byte-range-resp-spec "/"
                             ( instance-length | "*" )
   byte-range-resp-spec = (first-byte-pos "-" last-byte-pos) | "*"
   instance-length = 1*DIGIT
   Example (the first 500 bytes): bytes 0-499/1234
17. Content-Type
   Content-Type = "Content-Type" ":" media-type
   Example: Content-Type: text/html; charset=ISO-8859-4
18. Date
   Date = "Date" ":" HTTP-date
   Example: Date: Tue, 15 Nov 1994 08:12:31 GMT
   Clockless origin server operation
19. ETag
   ETag = "ETag" ":" entity-tag
   Example: ETag: W/"xyzzy"
20. Expect
   Expect = "Expect" ":" 1#expectation
   expectation = "100-continue" | expectation-extension
   expectation-extension = token [ "=" ( token | quoted-string )
                           *expect-params ]
   expect-params
     = ";" token [ "=" ( token | quoted-string ) ]
21. Expires
   Expires = "Expires" ":" HTTP-date
   Example: Expires: Thu, 01 Dec 1994 16:00:00 GMT
22. From
   From = "From" ":" mailbox
   Example: From: webmaster@w3.org
23. Host
   Host = "Host" ":" host [ ":" port ] ; Section 3.2.2
24. If-Match
   If-Match = "If-Match" ":" ( "*" | 1#entity-tag )
   Example: If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
25. If-Modified-Since
   If-Modified-Since = "If-Modified-Since" ":" HTTP-date
   Example: If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
26. If-None-Match
   If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag )
   Example: If-None-Match: W/"xyzzy", W/"r2d2xxxx", W/"c3piozzzz"
27. If-Range
   If-Range = "If-Range" ":" ( entity-tag | HTTP-date )
28. If-Unmodified-Since
   If-Unmodified-Since = "If-Unmodified-Since" ":" HTTP-date
   Example: If-Unmodified-Since: Sat, 29 Oct 1994 19:43:31 GMT
29. Last-Modified
   Last-Modified = "Last-Modified" ":" HTTP-date
   Example: Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT
30. Location
   Location = "Location" ":" absoluteURI
   Example: Location: http://www.w3.org/pub/WWW/People.html
31. Max-Forwards
   Max-Forwards = "Max-Forwards" ":" 1*DIGIT
32. Pragma
   Pragma = "Pragma" ":" 1#pragma-directive
   pragma-directive = "no-cache" | extension-pragma
   extension-pragma = token [ "=" ( token | quoted-string ) ]
33. Proxy-Authenticate
   Proxy-Authenticate = "Proxy-Authenticate" ":" 1#challenge
34. Proxy-Authorization
   Proxy-Authorization = "Proxy-Authorization" ":" credentials
35. Range
   Byte ranges
   Range retrieval requests
   Range = "Range" ":" ranges-specifier
36. Referer
   Referer = "Referer" ":" ( absoluteURI | relativeURI )
37. Retry-After
   Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds )
38. Server
   Server = "Server" ":" 1*( product | comment )
39. TE
   TE = "TE" ":" #( t-codings )
   t-codings = "trailers" | ( transfer-extension [ accept-params ] )
   Example: TE: trailers, deflate;q=0.5
40. Trailer
   Trailer = "Trailer" ":" 1#field-name
41. Transfer-Encoding
   Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding
   Example: Transfer-Encoding: chunked
42. Upgrade
   Upgrade = "Upgrade" ":" 1#product
   Example: Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11
43. User-Agent
   User-Agent = "User-Agent" ":" 1*( product | comment )
   Example: User-Agent: CERN-LineMode/2.15 libwww/2.17b3
44. Vary
   Vary = "Vary" ":" ( "*" | 1#field-name )
45. Via
   Via = "Via" ":" 1#( received-protocol received-by [ comment ] )
   received-protocol = [ protocol-name "/" ] protocol-version
   protocol-name = token
   protocol-version = token
   received-by = ( host [ ":" port ] ) | pseudonym
   pseudonym = token
   Example: Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy
46. Warning
   Warning = "Warning" ":" 1#warning-value
   warning-value = warn-code SP warn-agent SP warn-text [SP warn-date]
   warn-code = 3DIGIT
   warn-agent = ( host [ ":" port ] ) | pseudonym
   warn-text = quoted-string
   warn-date = <"> HTTP-date <">
47. WWW-Authenticate
   WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge

XV. Security Considerations

This section offers some suggestions but does not contain definitive
solutions.

1. Personal information
   Abuse of server log information
   Transfer of sensitive information
   Encoding sensitive information in URIs
   Privacy issues connected to Accept headers
2. Attacks based on file and path names
3. DNS spoofing
4. Location headers and spoofing
5. Content-Disposition issues
6. Authentication credentials and idle clients
7. Proxies and caching
   Denial of service attacks on proxies
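
Worked example: quality values in an Accept header

The qvalue rule (Protocol Parameters, item 9) and the Accept examples (Header Field Definitions, item 1) can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the negotiation algorithm of the specification: it splits the field value on commas and semicolons, reads the q parameter (defaulting to 1 when absent), and orders the media ranges by q. It ignores quoted-string parameters and the rule that more specific media ranges take precedence. The function name parse_accept is hypothetical.

```python
def parse_accept(field_value):
    """Return (media_range, q) pairs from an Accept field value, highest q first.

    Simplified sketch: ignores quoted-string parameters and the precedence
    of more specific media ranges over less specific ones.
    """
    ranked = []
    for element in field_value.split(","):
        parts = [part.strip() for part in element.split(";")]
        media_range = parts[0]
        if not media_range:
            continue  # skip empty list elements
        q = 1.0  # an absent q parameter is treated as q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value.strip())
                break  # parameters after q belong to accept-extension
        ranked.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep their original order.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Example 2 from the Accept definition.
print(parse_accept("text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c"))
```

With the second Accept example as input, text/html and text/x-c (both q=1) come first, followed by text/x-dvi (q=0.8) and text/plain (q=0.5).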