Design and Implementation of an MP3 Downloader


Paper ID: TX254. Word count: 13,503; pages: 36. Includes the proposal report, task statement, and program source code.

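Abstract
 Search engines, the "portals" through which users reach the Internet, are a shortcut to finding information on the WWW quickly and effectively. The web crawler is a key search-engine technology: a program that automatically fetches, analyzes, and filters web pages, downloading pages from the World Wide Web on the search engine's behalf. File transfer is among the most important functions of network applications and the basis of resource sharing on the Internet, which has made download tools indispensable. Major protocols such as HTTP and FTP support file transfer, and download mechanisms that are multi-task, multi-threaded, multi-source, and resumable, particularly those based on P2P technology, have greatly increased download speeds and widened the sharing of network resources.
 This thesis first introduces the main theory and technologies involved. Building on a detailed analysis of how crawlers work and of file download mechanisms, it adapts the crawler algorithm to this application and uses the adapted algorithm to design and implement an MP3 downloader with two parts: a web crawler and a file downloader. The crawler collects the URLs of MP3 music resources on the Internet together with related information (song title, artist, album, and so on) and saves this information locally in XML format as the basis for later queries and downloads. The downloader implements HTTP-based file download with resumable transfers (continuing from the point of interruption), multi-task downloading, and automatic file renaming. The MP3 downloader was then tested, and the results show that both crawling MP3 information and downloading MP3 files achieved the expected results.
 The thesis closes with a summary of the work and an outlook on future work.
 
Keywords: search engine, web crawler, HTTP, P2P, resumable download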
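
The crawler half described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the breadth-first order follows the strategy named in the table of contents, while the function names (`extract_links`, `crawl`, `records_to_xml`), the `href` regular expression, the injected `fetch` callable, and the XML element names are all assumptions made for the example. The thesis's own improvements to the crawler algorithm are not described on this page and are not reproduced here.

```python
import re
import xml.etree.ElementTree as ET
from collections import deque
from urllib.parse import urljoin

# Assumed pattern: captures the value of any quoted href attribute.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def extract_links(html, base_url):
    """Return (mp3_urls, page_urls) found in one HTML page, as absolute URLs."""
    mp3_urls, page_urls = [], []
    for href in HREF_RE.findall(html):
        url = urljoin(base_url, href)
        path = url.split("?", 1)[0].lower()  # ignore any query string
        if path.endswith(".mp3"):
            mp3_urls.append(url)
        elif url.startswith(("http://", "https://")):
            page_urls.append(url)  # skips mailto:, javascript:, etc.
    return mp3_urls, page_urls


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl. `fetch(url)` is supplied by the caller and
    returns the page's HTML text, or None if the page is unavailable."""
    queue, seen, found = deque([start_url]), {start_url}, []
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages += 1
        if html is None:
            continue
        mp3_urls, page_urls = extract_links(html, url)
        for mp3 in mp3_urls:
            if mp3 not in found:
                found.append(mp3)
        for link in page_urls:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return found


def records_to_xml(records):
    """Serialize song records (dicts) to an XML string.
    The element names are hypothetical, not the thesis's schema."""
    root = ET.Element("songs")
    for rec in records:
        song = ET.SubElement(root, "song")
        for field in ("title", "artist", "album", "url"):
            ET.SubElement(song, field).text = rec.get(field, "")
    return ET.tostring(root, encoding="unicode")
```

Passing `fetch` in as a parameter keeps the traversal logic independent of the network layer, so it can be exercised against a dictionary of canned pages before pointing it at real sites.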
 Design and Implementation of an MP3 Downloader
Abstract
 The search engine, as the "portal" for visiting the Internet, is a shortcut to rapid and effective access to information resources on the WWW. The web crawler is the key technology of a search engine: it is a program that automatically extracts, analyzes, and filters web pages, downloading pages from the World Wide Web for the search engine. File transfer, one of the most important network application functions, is also the basis of resource sharing on the Internet, and download tools have become indispensable. Important protocols such as HTTP and FTP support the transmission of files. In particular, download mechanisms based on P2P technology, with multi-tasking, multi-threading, multiple sources, and resumable transfer, greatly improve download speed and maximize the sharing of network resources.
 This thesis first introduces the main theory and technology related to the topic, analyzes in depth the principles of the web crawler and the mechanisms of file download, and improves the web crawler algorithm to suit the application. An MP3 downloader is then designed and implemented according to the improved crawler algorithm. The web crawler collects links to MP3 resources on the Internet together with related information (title, artist, album, etc.) and stores this information as XML in a local file, providing a basis for later queries and downloads. The downloader implements HTTP-based download and provides resumable transfer, multi-task download, and automatic renaming of downloaded files. Finally, the MP3 downloader is tested, and the tests show that it achieves the expected results.
 The thesis ends with a review of the work and an outlook on the topic.

Key Words: Search Engine, Web Crawler, HTTP, P2P, Resumable Download
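
The download half described in the abstract, HTTP download with resumable transfer and automatic renaming, can be sketched as below. The sketch relies on the standard HTTP `Range` request header and the `206 Partial Content` status; everything else (the function names `range_header`, `resume_download`, and `unique_filename`, the injectable `opener` parameter, and the `name(1).ext` renaming scheme) is an assumption for illustration, since this page does not say how the thesis implements these features.

```python
import os
import urllib.request


def range_header(offset):
    """Ask the server for bytes from `offset` onward; no header for a fresh download."""
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


def resume_download(url, path, opener=urllib.request.urlopen, chunk_size=64 * 1024):
    """Download `url` to `path`, continuing from the bytes already on disk.
    Returns the size of the file after the transfer."""
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    request = urllib.request.Request(url, headers=range_header(offset))
    with opener(request) as response:
        # 206 means the server honoured the range, so append to the partial file.
        # Any other success status means the whole file is being sent again,
        # so the partial file is overwritten from byte 0.
        status = getattr(response, "status", 200)
        mode = "ab" if (offset > 0 and status == 206) else "wb"
        with open(path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
    return os.path.getsize(path)


def unique_filename(directory, name):
    """Return `name`, or `name(1).ext`, `name(2).ext`, ... if it already exists.
    The numbering scheme is an assumed one."""
    base, ext = os.path.splitext(name)
    candidate, n = name, 1
    while os.path.exists(os.path.join(directory, candidate)):
        candidate = f"{base}({n}){ext}"
        n += 1
    return candidate
```

A real downloader would also need to handle the case where the local file is already complete (servers typically answer such a range request with `416 Range Not Satisfiable`, which `urlopen` raises as an `HTTPError`) and would ideally confirm that the remote file has not changed since the partial download began.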

 


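Contents
1 Introduction
 1.1 Background and purpose of the project
 1.2 Current research and trends at home and abroad
 1.2.1 Search engines
 1.2.2 File download
 1.3 Research content and significance
 1.4 Structure of the thesis
2 Technical overview
 2.1 Regular-expression matching
 2.2 XML
 2.3 How search engines work
 2.4 Threads
 2.4.1 Threads
 2.4.2 Multithreading
 2.5 MP3 tag information
 2.6 The HTTP protocol
 2.7 The PageRank algorithm
 2.8 Chapter summary
3 System design and implementation
 3.1 System flowchart
 3.2 MP3 crawler algorithm
 3.2.1 Breadth-first traversal strategy
 3.2.2 Crawler algorithm improvements for this project
 3.2.3 Parsing HTML
 3.3 MP3 tags
 3.3.1 Extracting MP3 tags
 3.3.2 Storing MP3 tags
 3.4 File download
 3.4.1 Resumable download
 3.4.2 Batch download
 3.4.3 File renaming
 3.4.4 Calculating download speed, progress, and remaining time
 3.5 .ini configuration file
 3.6 delegate and event: custom events
 3.7 Chapter summary
4 Analysis of experimental results
 4.1 Web crawler
 4.2 Query
 4.3 File download
 4.4 Analysis of results
 4.5 Chapter summary
5 Summary and outlook
 5.1 Summary
 5.2 Outlook
Acknowledgements
References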


(Author: anonymous; editor: admin)