Song Ci as a source of inspiration for event titles

Do you envy the literati who can toss off polished verse? Now you no longer have to. A netizen, "Yixuan", tinkered with the Quan Song Ci (the complete Song ci) and worked out its 99 most frequent words. Memorize them and you can compose at will. You can even turn arbitrary digits into a "magnificent" ci, and pi works too!

Song Ci, condensed: "East wind, where, the human world"

"Yixuan" wrote on his personal blog: "I suddenly wanted to see which images are most common in Song Ci, for example by doing a frequency analysis. Of course, text mining requires word segmentation, and I can't spend too much time on that, so I came up with a workaround. The sentences of Song Ci are very short, so the number of possible character combinations is small, and the most common words are usually two or three characters long, which narrows it further. Take the sentence 犹解嫁东风 ('still knows to wed the east wind'): its possible two-character combinations are 犹解, 解嫁, 嫁东 and 东风, and its three-character combinations are 犹解嫁, 解嫁东 and 嫁东风. The longer the word, the fewer the possible combinations. If you list all the possible words in each sentence, you can count the frequencies over the whole text."

Everyone agreed that this friend must be a science student.

Then "Yixuan" posted the high-frequency words he had counted:
1. (invalid character) (1485)
2. east wind (1382)
3. where (1230)
4. the human world (1202)
5. romantic (857)
6. going back (812)
7. spring breeze (802)
8. west wind (779)
9. coming back (771)
10. Jiangnan (765)

As for why the first entry has only a number, he explains: "The first row is an invalid character, which has to do with the data source."

Once the results were out, a netizen gave away the "secret": "So the most popular Song ci turns out to be 'East wind, where, the human world'!"

Birthdays and phone numbers, combined at random, come out beautifully

What's more, a "Nutshell" user with the handle "Da Vinci's egg" figured that everyone can recite a few digits of pi. Take the digits two at a time, look each pair up in the high-frequency list, and a "magnificent" ci comes out. He even attached annotations, with great solemnity.

"Da Vinci's egg", Nutshell

Qing Pingle · Pi
1415    回首明月: looking back, the bright moon (lyrical)
9265    悠悠心事: long, lingering cares (indeed, seemingly lovelorn)
358979  故人谁知寂寞: old friend, who knows this loneliness (moved by the scene)
323846  风吹斜阳匆匆: the wind blows, the setting sun hurries (recalling the affair of that afternoon)
264338  芳草平生斜阳: fragrant grass, a lifetime, the setting sun (the most beautiful setting sun and fragrant grass I have seen in my life were beautiful because your figure was there)
327950  风吹寂寞今日: the wind blows, lonely today (now I am by myself)
288419  一枝富贵年年: a spray of splendor, year after year (fine flowers, fine years, fine scenery, yet they suit no one)
716939  断肠长安不知: heartbroken, and Chang'an does not know (do you know that I miss you?)

Dream · Root Two
414213  深处时节千里: deep in the season, a thousand li away (many years later, the male lead has come to a very distant place)
562373  消息当年鸳鸯: news of those years' mandarin ducks (the old mandarin-duck pair, where to find them?)
0950    归来今日: coming back today ("Sister, I, Hu Hansan, am back")
488016  一点无情多少: a little unfeeling, how much (that's a bit unkind: you know, so come and see me!)
8888    今夜今夜: tonight, tonight (something is up)
724209  而今时节归来: now is the season of return (back in that year again, the male lead is still as strapping as ever)

After reading these poems, netizens bowed in admiration, and their own creative inspiration was unlocked.

"Qiu cold" left a message: "Let me try composing one from a number: lovesickness in the heavens, fragrant grass year after year, last night in Jiangnan, looking back with a smile, full of feeling."

"Zero Ronnie" was fired up: "My birthday: 'on the river, Jiangnan in spring'; my mobile phone number: 'the bright moon of last year, back to Jiangnan'. It's really catchy! I can be a poet too. Haha!"

Science students cheer: the day to wipe out liberal arts students has come

With that, the liberal arts students lost their cool, and a group of them jumped out to hit back.

"Wen Ming Xia Xia" shouted: "Drag them all out and chop them! Why can't Chinese majors be left in peace?!"

"Rockfish" counterattacked on behalf of the science students: "The day for science otaku to wipe out liberal arts students has come! Pick up your calculators and wipe out the literary youth!" "Cocoa" called out: "Let the fresh young faces of science and engineering come on even stronger!"

Other netizens chimed in with all sorts of wisecracks. "This world" said calmly: "I wonder whether fans of Song Ci feel as if an idol has been shattered." "Jia Jieshi" concluded: "Now stay-at-home men and women can compose verse at home!" "Rakin" commented: "Science students are fierce even at art!" "Yan Xin Spring" called out: "If you want it, please repost!" "ChanIm" said: "When you were learning to write poetry you looked down on this; now you are calm about it, because you are no longer so sentimental." "Yearning autumn" sighed: "A must-repost!" "NetCharm" said: "Just memorize and combine, and what you write won't be too bad." "Huahualipo" offered a suggestion: "Master, could you also add pingze (tonal patterns)? If that is too difficult, could you at least consider end rhyme? Sort the words into rhyme groups, assign each group a value, and then, on every other line, choose the final word only from a single value. That would greatly increase the realism!"

Another brain short-circuit: I suddenly wanted to see which images are most common in Song Ci, for example by doing a frequency analysis. Of course, text mining requires word segmentation, and I can't spend too much time on that, so I came up with a workaround. The sentences of Song Ci are very short, so the number of possible character combinations is small, and the most common words are usually two or three characters long, which narrows it further. Take the sentence 犹解嫁东风: its possible two-character combinations are 犹解, 解嫁, 嫁东 and 东风, and its three-character combinations are 犹解嫁, 解嫁东 and 嫁东风. The longer the word, the fewer the possible combinations. If you list all the possible words in each sentence, you can count the frequencies over the whole text.

Of course, this produces many meaningless combinations, but such "words" arise only by coincidence, so their overall frequency can be expected to be very low and they will not make it into polite company. Enough talk; straight to the code and the results.

Data: the text of Quan Song Ci

Code:

    l = scan("Ci.txt", "character", sep = "\n")
    l.len = nchar(l)
    # Some lines are authors and titles, so keep only lines longer than 10 characters.
    # The text file is also somewhat irregular, so overly long lines should be
    # excluded too (that upper cutoff did not survive in this copy of the code).
    ci = l[l.len > 10]
    # Split into sentences at punctuation.
    # (The delimiter set is partly garbled in this copy; the last one, 、, is assumed.)
    sentences = unlist(strsplit(ci, "，|。|！|？|、"))
    sentences = sentences[sentences != ""]
    s.len = nchar(sentences)
    # Overly long sentences probably contain bad characters, so drop them.
    # One-character fragments are dropped too, because 1:(x.len - 1) misfires on them.
    sentences = sentences[s.len >= 2 & s.len <= 10]
    s.len = nchar(sentences)
    # Brute-force split: for 犹解嫁东风 the two-character combinations are
    # 犹解, 解嫁, 嫁东 and 东风; meaningless ones naturally fall behind in frequency.
    splitwords = function(x, x.len) substring(x, 1:(x.len - 1), 2:x.len)
    words = mapply(splitwords, sentences, s.len, SIMPLIFY = FALSE, USE.NAMES = FALSE)
    words = unlist(words)
    words.freq = sort(table(words), decreasing = TRUE)
    data.frame(Word = names(words.freq), Freq = as.integer(words.freq))

Results (the first row is an invalid character, which has to do with the data source):

Rank  Word  Frequency
1  (invalid character)  1485
2  东风 (east wind)  1382
3  何处 (where)  1230
4  人间 (the human world)  1202
5  风流 (romantic)  857
6  归去 (going back)  812
7  春风 (spring breeze)  802
8  西风 (west wind)  779
9  归来 (coming back)  771
10  江南 (Jiangnan)  765
11  相思 (lovesickness)  753
12  梅花 (plum blossom)  732
13  千里 (a thousand li)  676
14  回首 (looking back)  656
15  明月 (bright moon)  651
16  多少 (how much)  648
17  如今 (nowadays)  642
18  阑干 (railing)  630
19  年年 (year after year)  613
20  万里 (ten thousand li)  590
21  一笑 (a smile)  582
22  黄昏 (dusk)  550
23  当年 (those years)  542
24  天涯 (the ends of the earth)  537
25  相逢 (meeting)  528
26  芳草 (fragrant grass)  527
27  尊前 (before the wine cup)  516
28  一枝 (a spray)  512
29  风雨 (wind and rain)  505
30  流水 (flowing water)  472
31  依旧 (as before)  472
32  风吹 (the wind blows)  471
33  风月 (wind and moon)  461
34  多情 (full of feeling)  457
35  故人 (old friend)  451
36  当时 (at that time)  450
37  无人 (no one)  445
38  斜阳 (setting sun)  438
39  不知 (not knowing)  430
40  不见 (not seeing)  429
41  深处 (deep within)  422
42  时节 (season)  403
43  平生 (a lifetime)  398
44  凄凉 (desolate)  398
45  春色 (spring colors)  394
46  匆匆 (in haste)  383
47  功名 (rank and fame)  383
48  一点 (a little)  378
49  无限 (boundless)  377
50  今日 (today)  369
51  天上 (in the heavens)  368
52  杨柳 (willows)  362
53  西湖 (West Lake)  356
54  桃花 (peach blossom)  354
55  扁舟 (small boat)  353
56  消息 (news)  351
57  憔悴 (haggard)  344
58  何事 (for what reason)  339
59  芙蓉 (lotus)  338
60  神仙 (immortals)  334
61  一片 (a stretch of)  334
62  桃李 (peach and plum)  333
63  人生 (human life)  332
64  十分 (fully)  331
65  心事 (cares of the heart)  329
66  黄花 (yellow flowers)  328
67  一声 (a sound)  325
68  佳人 (a beauty)  324
69  长安 (Chang'an)  321
70  东君 (Lord of the East)  319
71  断肠 (heartbroken)  316
72  而今 (now)  315
73  鸳鸯 (mandarin ducks)  314
74  为谁 (for whom)  313
75  十年 (ten years)  310
76  去年 (last year)  309
77  少年 (youth)  308
78  海棠 (crabapple blossom)  307
79  寂寞 (lonely)  306
80  无情 (unfeeling)  306
81  不是 (is not)  305
82  时候 (time)  304
83  肠断 (heartbroken)  303
84  富贵 (wealth and rank)  303
85  蓬莱 (Penglai)  303
86  昨夜 (last night)  303
87  行人 (traveler)  302
88  今夜 (tonight)  301
89  谁知 (who knows)  300
90  不似 (unlike)  299
91  江上 (on the river)  298
92  悠悠 (long and lingering)  296
93  几度 (how many times)  295
94  青山 (green hills)  295
95  何时 (when)  294
96  天气 (weather)  293
97  惟有 (only)  293
98  一曲 (a song)  291
99  月明 (moonlit)  291
100  往事 (the past)  290

I wonder what readers feel on seeing these words, at once so familiar and so remote. Perhaps they are the spiritual sustenance we have carried for a thousand years.

I tried running this code in R, but it seemed to have problems on my machine. So, following the same idea, I built a simple counting workflow in KNIME (same data source). Because I handled a few more anomalies, the ranking of two-character words is basically the same and the counts differ slightly, so I won't post them again. I can, however, post the frequencies of short sentences :D

到如今 (until now) 50; 谁知道 (who knows) 46; 君知否 (do you know) 30; 功名事 (matters of rank and fame) 28
须信道 (one must believe) 28; 人间世 (the human world) 27; 最好是 (best of all is) 26; 从今去 (from now on) 26
凝伫 (standing rapt) 25; 不如归去 (better to go back) 24; 归去 (going back) 23; 知否 (do you know) 23
谁信道 (who would believe) 23; 到而今 (until now) 21; 倚阑干 (leaning on the railing) 21
又还是 (and yet again) 21
归去来兮 (homeward bound) 21; 当此际 (at this moment) 20; 人不见 (no one in sight) 20; 记当年 (remembering those years) 19
东风里 (in the east wind) 18; 春去也 (spring is gone) 18; 怎奈向 (what can be done) 18; 须知道 (one must know) 18
争知道 (how could one know) 17; 留不住 (cannot be kept) 17; 更那堪 (how much harder to bear) 17; 谩赢得 (winning only in vain) 17
那堪更 (how to bear, and moreover) 17; 休休 (enough, enough) 16; 一觞一咏 (a cup and a verse) 16; 君不见 (have you not seen) 16
家山好 (the hills of home are fine) 16; 思往事 (thinking of the past) 16; 归来也 (coming back) 16; 悠悠 (long and lingering) 16
"No end", 16 ", 16" know "chasing the past", 16 ", 15 Heaven On Earth
"The most bitter is" 14, "shuyinghengxie", 14 ", 14" empty I see "empty melancholy", 14
"Remember the year", 14 "human affairs", "13" and "only fear", 13 "looking back", 13
"Night Shen Shen", 13 "broken human bowel", 13 "early return", "13", "how much", 13
The "empty" to stand expectantly ", 13, 12" respect "before the rain", "12 love tenderness", 12
"The sun", "12 Speechless", 12 ", 12" Moonlight "looks green hair", 12
"Young girl", 12 "who read me", 12 ", 12" also know what to ask you ", 12
"11" Dongfeng, cannot bear to think of the past "evil", 11 ", 11" where "is" 11.
"The old", "from the other 11", 11 ", 11" leaning East "and why", 11
"How many things", 11 "everlasting", 11 "Anyang good", 11 "against the east wind", 11
"Against the west wind", 11 "wide cold palace", 11 "return", "11", "return late", 11
"Wish every year", 11 "the South Bank of the river", 11 "empty looking back", "11" "not like", 11
"Pain", 11 ", 11" sigh "," West 11 flower catkins "source", 11
"With" 11 "romantic", 11 ", 11" whiz "and who", 10
"Wu Yun deep", "10 world where unforgettable wine", "10 person", "from 10 to 10.
"Western" 10 "is clearly", 10 ", 10" Nan Xu rank, success, fame and riches "good", 10
"Year", 10 ", 10" clear thinking "to the" 10 "unlimited", 10
"Every morning and evening," 10 "," 10 song slim "alone", 10 ", 10 bamboo fences and hay-thatched mud cottages
"With calm posture", "but" 10, 10 ", 10" the record was "10".
"Drunk back", 10 "seventy, 9" people rarely "jade", "9 people", 9
"Where", "9 gaze", 9 "eternal hate", "9 years of age", 9
"The last day", 9 ", 9" to the "take off", "9 talents and 9".
"9" year after year today, nice day and beautiful night "," 9 "to return", 9 "sorrow" 9
"Where" friends, "9, 9" at leisure "West", 9 "Pipa", 9
"Muddy not like", 9 "clean", 9 "Yingying", 9 "jejunum broken", 9
"Empty wins", "9 calculations", 9 "count only", 9 "predestined relationship", 9
"Remember the day", 9 ", 9" and "9" are paid to demeanour of a transcendent being, "9".
"Hugh asked", 9 ", 9" sober "ask the world", "when asked," 9 9
"Wind is uncertain", 9 "sound", "8", "8", "twenty years", 8
"After the people scattered", 8 "people easy to old", 8 "from the future", "8" to go, 8
"Rhetorica drunk", 8 ", 8" is still "when Hugh", 8 "Pinglan long", 8
"High altitude", 8 "and 8" monarch who knows ", and" 8, "I am old, 8
"Looking back", "8" "envy", 8 "how much hate", 8 "night wind and rain", 8
"The whole world", "8 day water", 8 "to" 8, "smile", 8
"Loneliness", "8 mountains", "8 return", "8 things under the heart", 8
"Know how", 8 ", 8" you think long "season", 8 ", 8 was no
"May the year" 8 "the Prefecture", 8 "no matter", "the 8", 8
The "8 benefits", "who knows," 8 "," 8 world "abroad", 8
"8" painted hall, elegant demeanor and high personality "deep", "8 board", 8 ", 8
"True is", 8 "know where", 8 "bosom friend less", 8 "Shou Shou", 8
"Yi", 8 ", 8" human laughter "screen", 8 ", 8 Petals drop and waters flow.
"Changan", 8 ", 8" when asked "period of rain", "8 frequency back", 8
"Wind and rain", 8 "go with the wind",
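
The brute-force counting described in the blog post (split the text into short sentences, list every n-character substring, tally them) and the digits-to-verse trick built on the ranked list can be sketched in Python roughly as follows. This is a minimal sketch, not the author's R code: the function names, the delimiter set, the `max_len` default and the two sample lines are assumptions made for illustration, and `RANKED` holds only the first 16 rows of the table published in the post.

```python
import re
from collections import Counter

# Assumed delimiter set; the post splits on Chinese punctuation.
PUNCT = re.compile(r"[，。！？、；\s]+")


def split_sentences(text, max_len=10):
    """Split on punctuation and drop empty or overlong pieces."""
    return [s for s in PUNCT.split(text) if 0 < len(s) <= max_len]


def ngrams(sentence, n=2):
    """Return every contiguous n-character substring of a sentence."""
    return [sentence[i:i + n] for i in range(len(sentence) - n + 1)]


def count_ngrams(text, n=2, max_len=10):
    """Tally all n-character substrings over all short sentences."""
    counts = Counter()
    for sentence in split_sentences(text, max_len):
        counts.update(ngrams(sentence, n))
    return counts


def digits_to_verse(digits, ranked):
    """Read digits two at a time as 1-based ranks into `ranked`."""
    words = []
    for i in range(0, len(digits) - 1, 2):
        rank = int(digits[i:i + 2])
        if rank == 0:
            # The published table has no rank 0, so refuse to guess.
            raise ValueError("pair 00 has no rank in the table")
        words.append(ranked[rank - 1])
    return "".join(words)


# Illustrative sample text (two well-known lines), not the author's corpus.
SAMPLE = "不如桃杏，犹解嫁东风。东风夜放花千树"

# First 16 rows of the published ranking; row 1 is the invalid character.
RANKED = ["?", "东风", "何处", "人间", "风流", "归去", "春风", "西风",
          "归来", "江南", "相思", "梅花", "千里", "回首", "明月", "多少"]

if __name__ == "__main__":
    print(ngrams("犹解嫁东风", 2))          # ['犹解', '解嫁', '嫁东', '东风']
    print(count_ngrams(SAMPLE).most_common(1))
    print(digits_to_verse("1415", RANKED))  # 回首明月
```

With all 100 rows in `RANKED`, the same call would turn the remaining digit groups of the pi poem into their lines. A pair of 00 has no rank in the published table, so the sketch raises an error rather than guess at a mapping.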