Standard MIDI-File Format Spec. 1.1

Distributed by:
The International MIDI Association
5316 W. 57th St.
Los Angeles, CA 90056
(213) 649-6434

0 - Introduction

The document outlines the specification for MIDI Files. The purpose
of MIDI Files is to provide a way of interchanging time-stamped MIDI
data between different programs on the same or different computers.
One of the primary design goals is compact representation, which makes
it very appropriate for a disk-based file format, but which might make
it inappropriate for storing in memory for quick access by a
sequencer program. (It can be easily converted to
a quickly-accessible format on the fly as files are read in or written
out.) It is not intended to replace the normal file format of any
program, though it could be used for this purpose if desired.

MIDI Files contain one or more MIDI streams, with time information for
each event. Song, sequence, and track structures, tempo and time
signature information, are all supported. Track names and other
descriptive information may be stored with the MIDI data. This format
supports multiple tracks and multiple sequences so that if the user of
a program which supports multiple tracks intends to move a file to
another one, this format can allow that to happen.

This spec defines the 8-bit binary data stream used in the file. The
data can be stored in a binary file, nibbilized, 7-bit-ized for
efficient MIDI transmission, converted to Hex ASCII, or translated
symbolically to a printable text file. This spec addresses what's in
the 8-bit stream. It does not address how a MIDI File will be
transmitted over MIDI. It is the general feeling that a MIDI
transmission protocol will be developed for files in general and MIDI
Files will use this scheme.

1 - Sequences, Tracks, Chunks: File Block Structure

CONVENTIONS

In this
document, bit 0 means the least significant bit of a byte, and
bit 7 is the most significant.

Some numbers in MIDI Files are represented in a form called
VARIABLE-LENGTH QUANTITY. These numbers are represented 7 bits per
byte, most significant bits first. All bytes except the last have bit
7 set, and the last byte has bit 7 clear. If the number is between
0 and 127, it is thus represented exactly as one byte.

Here are some examples of numbers represented as variable-length
quantities:

    Number (hex)    Representation (hex)
    00000000        00
    00000040        40
    0000007F        7F
    00000080        81 00
    00002000        C0 00
    00003FFF        FF 7F
    00004000        81 80 00
    00100000        C0 80 00
    001FFFFF        FF FF 7F
    00200000        81 80 80 00
    08000000        C0 80 80 00
    0FFFFFFF        FF FF FF 7F

The largest number which is allowed is 0FFFFFFF so that the
variable-length representations must fit in 32 bits in a routine to
write variable-length numbers. Theoretically, larger numbers are
possible, but 2 x 10^8 96ths of a beat at
a fast tempo of 500 beats per minute is four days, long enough for any
delta-time!

FILES

To any file system, a MIDI File is simply a series of 8-bit bytes.
On the Macintosh, this byte stream is stored in the data fork of a
file (with file type 'MIDI'), or on the Clipboard (with data type
'MIDI'). Most other computers store 8-bit byte streams in files -
naming or storage conventions for those computers will be defined as
required.

CHUNKS

MIDI Files are made up of -chunks-. Each chunk has a 4-character type
and a 32-bit length, which is the number of bytes in the chunk. This
structure allows future chunk types to be designed which may easily
be ignored if encountered by a program written before the chunk type
is introduced. Your programs should EXPECT alien chunks and treat
them as if they weren't there.

Each chunk begins with a 4-character ASCII type. It is followed by a
32-bit length, most significant byte first (a length of 6 is stored
as 00 00 00 06). This length refers to the number of bytes of data
which follow: the eight bytes of type and length are not included.
Therefore, a chunk with a length of 6 would actually occupy 14 bytes
in the disk file.

This chunk architecture is similar to that used by Electronic Arts'
IFF format, and the chunks described herein could easily be placed in
an IFF file. The MIDI File itself is not an IFF file: it contains
no nested chunks, and chunks are not constrained to be an even number
of bytes long. Converting it to an IFF file is as easy as padding
odd-length chunks, and sticking the whole thing inside a FORM chunk.

MIDI Files contain two types of chunks: header chunks and track chunks.
A -header- chunk provides a minimal amount of information pertaining
to the entire MIDI file. A -track- chunk contains a sequential stream
of MIDI data which may contain information for up to 16 MIDI channels.
The concepts of multiple tracks, multiple MIDI outputs, patterns,
sequences, and songs may all be implemented using several track chunks.

A MIDI File always starts with a header chunk, and is followed by one
or more track chunks.

    MThd  <length of header data>
    <header data>
    MTrk  <length of track data>
    <track data>
    MTrk  <length of track data>
    <track data>
     . . .

2 - Chunk Descriptions

HEADER CHUNKS

The header chunk at the beginning of the file specifies some
basic information about the data in the file. Here's the syntax of
the complete chunk:

    <Header Chunk> = <chunk type> <length> <format> <ntrks> <division>

As described above, <chunk type> is the four ASCII characters 'MThd';
<length> is a 32-bit representation of the number 6 (high byte first).

The data section contains three 16-bit words, stored most-significant
byte first.

The first word, <format>, specifies the overall organization of the
file. Only three values of <format> are specified:

    0 - the file contains a single multi-channel track
    1 - the file contains one or more simultaneous tracks (or MIDI
        outputs) of a sequence
    2 - the file contains one or more sequentially independent
        single-track patterns

More information about these formats is provided below.

The next word, <ntrks>, is the number of track chunks in the file. It
will always be 1 for a format 0 file.

The third word, <division>, specifies the meaning of the delta-times.
It has two formats, one for metrical time, and one for time-code-based
time:

    +---+-----------------------------------------+
    | 0 |         ticks per quarter-note          |
    +===+=========================================+
    | 1 | negative SMPTE format | ticks per frame |
    +---+-----------------------+-----------------+
    |15 |14                   8 |7              0 |

If bit 15 of <division> is zero, the bits 14 thru 0 represent the
number of delta-time "ticks" which make up a quarter-note. For
instance, if division is 96,
then a time interval of an eighth-note between two events in the file
would be 48.

If bit 15 of <division> is a one, delta-times in a file correspond
to subdivisions of a second, in a way consistent with SMPTE and MIDI
Time Code. Bits 14 thru 8 contain one of the four values -24, -25, -29,
or -30, corresponding to
the four standard SMPTE and MIDI Time Code formats (-29 corresponds to
30 drop frame), and represent the number of frames per second. These
negative numbers are stored in two's complement form. The second byte
(stored positive) is the resolution within a frame: typical values may
be 4 (MIDI Time Code resolution), 8, 10, 80 (bit resolution), or 100.
This scheme allows exact specification of time-code-based tracks, but
also allows millisecond-based tracks by specifying 25 frames/sec and a
resolution of 40 units per frame. If the events in a file are stored
with a bit resolution of thirty-frame time code, the division word
would be E250 hex.

FORMATS 0, 1, AND 2

A Format 0 file has a header chunk followed by one track chunk. It
is the most interchangeable representation of data. It is very useful
for a simple single-track player in a program which needs to make
synthesizers make sounds, but which is primarily concerned with
something else such as mixers or sound effect boxes. It is very
desirable to be able to produce such a format, even if your program
is track-based, in order to work with these simple programs. On the
other hand, perhaps someone will write a format conversion from
format 1 to format
0 which might be so easy to use in some setting that it would save you
the trouble of putting it into your program.

A Format 1 or 2 file has a header chunk followed by one or more
track chunks. Programs which support several simultaneous tracks
should be able to save and read data in format 1, a vertically
one-dimensional form, that is, as a collection of tracks. Programs
which support several independent patterns should be able to save and
read data in format 2, a horizontally one-dimensional form.
Providing these minimum capabilities will ensure maximum
interchangeability.

In a MIDI system with a computer and a SMPTE synchronizer which uses
Song Pointer and Timing Clock, tempo maps (which describe the tempo
throughout the track, and may also include time signature information,
so that the bar number may be derived) are generally created on the
computer. To use them with the synchronizer, it is necessary to
transfer them from the computer. To make it easy for the synchronizer
to extract this data from a MIDI File, tempo information should always
be stored in the first MTrk chunk. For a format 0 file, the tempo will
be scattered through the track and the tempo map reader should ignore
the intervening events; for a format 1 file, the tempo map must be
stored as the first track. It is polite to a tempo map reader to offer
your user the ability to make a format 0 file with just the tempo,
unless you can use format 1.

All MIDI Files should specify tempo and time signature. If they don't,
the time signature is assumed to be 4/4, and the tempo 120 beats per
minute. In format 0, these meta-events should occur at least at the
beginning of the single multi-channel track. In format 1, these
meta-events should be contained in the first track. In format 2, each
of the temporally independent patterns should
contain at least initial time signature and tempo information.

We may decide to define other format IDs to support other structures.
A program encountering an unknown format ID may still read other MTrk
chunks it finds from the file, as format 1 or 2, if its user can make
sense of them and arrange them
into some other structure if appropriate. Also, more parameters may be
added to the MThd chunk in the future: it is important to read and
honor the length, even if it is longer than 6.

TRACK CHUNKS

The track chunks (type MTrk) are where actual song data is stored.
Each track chunk is simply a stream of MIDI events (and non-MIDI
events), preceded by delta-time values. The format for Track
Chunks (described below) is exactly the same for all three formats
(0, 1, and 2: see "Header Chunk" above) of MIDI Files.

Here is the syntax of an MTrk chunk (the + means "one or more": at
least one MTrk event must be present):

    <Track Chunk> = <chunk type> <length> <MTrk event>+

The syntax of an MTrk event is very simple:

    <MTrk event> = <delta-time> <event>

<delta-time> is stored as a variable-length quantity. It represents
the amount of time before the following event. If the first event in
a track occurs at the very beginning of a track, or if two
events occur simultaneously, a delta-time of zero is used.
Delta-times are always present. (Not storing delta-times of 0
requires at least two bytes for any other value, and most
delta-times aren't zero.) Delta-time is in some fraction of a beat
(or a second, for recording a track with SMPTE times), as specified
in the header chunk.

    <event> = <MIDI event> | <sysex event> | <meta-event>

<MIDI event> is any MIDI channel message.
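
Since every MTrk event begins with a delta-time stored as a
variable-length quantity, both a reader and a writer need a small
routine for that encoding. The following is a minimal Python sketch,
not part of the specification; the function names encode_vlq and
decode_vlq are illustrative assumptions. It follows the rule stated
under CONVENTIONS: 7 bits per byte, most significant bits first, bit 7
set on every byte except the last, and a maximum value of 0FFFFFFF.

```python
from typing import Tuple


def encode_vlq(value: int) -> bytes:
    """Encode a non-negative integer as a variable-length quantity."""
    if not 0 <= value <= 0x0FFFFFFF:
        raise ValueError("value out of range for a variable-length quantity")
    groups = [value & 0x7F]                   # last byte: bit 7 clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: bit 7 set
        value >>= 7
    return bytes(reversed(groups))            # most significant bits first


def decode_vlq(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one quantity at offset; return (value, offset of next byte)."""
    value = 0
    for count in range(4):                    # 0FFFFFFF needs at most 4 bytes
        byte = data[offset + count]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:                   # bit 7 clear marks the last byte
            return value, offset + count + 1
    raise ValueError("variable-length quantity longer than four bytes")


# Usage: read the delta-time at the start of some track data.
track_data = bytes.fromhex("81 48 F7 06")
delta, pos = decode_vlq(track_data)           # delta == 200, pos == 2
```

Returning the next offset alongside the value lets a track reader
continue directly with the event bytes that follow the delta-time.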
Running status is used: status bytes of MIDI channel messages may be
omitted if the preceding event is a MIDI channel message with the same
status. The first event in each MTrk chunk must specify status.
Delta-time is not considered an event itself: it is an integral part
of the syntax for an MTrk event. Notice that running status occurs
across delta-times.

<sysex event> is used to specify a MIDI system exclusive message,
either as one unit or in packets, or as an "escape" to specify any
arbitrary bytes to be transmitted. A normal complete system exclusive
message is stored in a MIDI File in this way:

    F0 <length> <bytes to be transmitted after F0>

The length is stored as a variable-length quantity. It specifies the
number of bytes which follow it, not including the F0 or the length
itself. For instance, the transmitted message F0 43 12 00 07 F7 would
be stored in a MIDI File as F0 05 43 12 00 07 F7. It is required to
include the F7 at the end so that the
reader of the MIDI File knows that it has read the entire message.

Another form of sysex event is provided which does not imply that an
F0 should be transmitted. This may be used as an "escape" to provide
for the transmission of things which would not otherwise be legal,
including system realtime messages, song pointer or select, MIDI Time
Code, etc. This uses the F7 code:

    F7 <length> <all bytes to be transmitted>

Unfortunately, some synthesizer manufacturers specify that their
system exclusive messages are to be transmitted as little packets. Each
packet is only part of an entire syntactical system exclusive message,
but the times they are transmitted are important. Examples of this
are the bytes sent in a CZ patch dump, or the FB-01's "system exclusive
mode" in which microtonal data can be transmitted. The F0 and F7 sysex
events may be used together to break up syntactically complete
system exclusive messages into timed packets.

An F0 sysex event is used for the first packet in a series - it
is a message in which the F0 should be transmitted. An F7 sysex event
is used for the remainder of the packets, which do not begin with F0.
(Of course, the F7 is not considered part of the system exclusive
message.)

A syntactic system exclusive message must always end with an F7, even
if the real-life device didn't send one, so that you know when you've
reached the end of an entire sysex message without looking ahead to
the next event in the MIDI File. If it's stored in one complete F0
sysex event, the last byte must be an F7. There also must not be
any transmittable MIDI events in between the packets of a multi-packet
system exclusive message. This principle is illustrated in the
paragraph below.

Here is a MIDI File of a multi-packet system exclusive message: suppose
the bytes F0 43 12 00 were to be sent, followed by a 200-tick delay,
followed by the bytes 43 12 00 43 12 00, followed by a 100-tick delay,
followed by the bytes 43 12 00 F7. This would be in the MIDI File:

    F0 03 43 12 00
    81 48                      200-tick delta-time
    F7 06 43 12 00 43 12 00
    64                         100-tick delta-time
    F7 04 43 12 00 F7

When reading a MIDI File, if an F7 sysex event is encountered without
a preceding F0 sysex event to start a multi-packet system exclusive
message sequence, it should be presumed that the F7 event is being
used as an "escape". In this case, it is not necessary that it end
with an F7, unless it is desired that the F7 be transmitted.

<meta-event> specifies non-MIDI information useful to this
format or to sequencers, with this syntax:

    FF <type> <length> <bytes>

All meta-events begin with FF, then have an event type byte (which
is always less than 128), and then have the length of the data stored
as a variable-length quantity, and then the data itself. If there is
no data, the length is 0. As with chunks, future meta-events may be
designed which may not be known to existing programs, so programs
must properly ignore meta-events which they do not recognize, and
indeed should expect to see them. Programs must never ignore the
length of a meta-event which they do not recognize, and they
shouldn't be surprised if it's bigger th