
📄 libcurl-tutorial.html

 <p class="level0">OK, so what if you want to post binary data that also requires you to set the Content-Type: header of the post? Binary data may contain zero bytes, so libcurl cannot use strlen() on it to figure out the size; we must therefore tell libcurl the size of the post data ourselves. Setting headers in libcurl requests is done in a generic way, by building a list of our own headers and then passing that list to libcurl. <p class="level0"><pre><p class="level0">&nbsp;struct curl_slist *headers=NULL;
&nbsp;headers = curl_slist_append(headers, "Content-Type: text/xml");
<p class="level0">&nbsp;/* post binary data */
&nbsp;curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, binaryptr);
<p class="level0">&nbsp;/* set the size of the postfields data (the option takes a long) */
&nbsp;curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDSIZE, 23L);
<p class="level0">&nbsp;/* pass our list of custom made headers */
&nbsp;curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers);
<p class="level0">&nbsp;curl_easy_perform(easyhandle); /* post away! */
<p class="level0">&nbsp;curl_slist_free_all(headers); /* free the header list */
<p class="level0">While the simple examples above cover the majority of all cases where HTTP POST operations are required, they don't do multi-part formposts. Multi-part formposts were introduced as a better way to post (possibly large) binary data and were first documented in RFC 1867. They're called multi-part because they're built from a chain of parts, each being a single unit. Each part has its own name and contents. You can in fact create and post a multi-part formpost with the regular libcurl POST support described above, but that would require that you build the formpost yourself and provide it to libcurl. To make that easier, libcurl provides <a class="emphasis" href="./curl_formadd.html">curl_formadd(3)</a>. Using this function, you add parts to the form. When you're done adding parts, you post the whole form. 
<p class="level0">The following example sets two simple text parts with plain textual contents, then adds a file with binary contents, and uploads the whole thing. <p class="level0"><pre><p class="level0">&nbsp;struct curl_httppost *post=NULL;
&nbsp;struct curl_httppost *last=NULL;
&nbsp;curl_formadd(&post, &last,
&nbsp;             CURLFORM_COPYNAME, "name",
&nbsp;             CURLFORM_COPYCONTENTS, "daniel", CURLFORM_END);
&nbsp;curl_formadd(&post, &last,
&nbsp;             CURLFORM_COPYNAME, "project",
&nbsp;             CURLFORM_COPYCONTENTS, "curl", CURLFORM_END);
&nbsp;curl_formadd(&post, &last,
&nbsp;             CURLFORM_COPYNAME, "logotype-image",
&nbsp;             CURLFORM_FILECONTENT, "curl.png", CURLFORM_END);
<p class="level0">&nbsp;/* Set the form info */
&nbsp;curl_easy_setopt(easyhandle, CURLOPT_HTTPPOST, post);
<p class="level0">&nbsp;curl_easy_perform(easyhandle); /* post away! */
<p class="level0">&nbsp;/* free the post data again */
&nbsp;curl_formfree(post);
<p class="level0">Multipart formposts are chains of parts using MIME-style separators and headers. This means that each of these separate parts gets a few headers that describe its individual content-type, size and so on. To let your application tailor the formpost even further, libcurl allows you to supply your own set of custom headers for an individual form part. 
You can of course supply headers to as many parts as you like, but this little example shows how you set headers for one specific part when you add it to the post handle: <p class="level0"><pre><p class="level0">&nbsp;struct curl_slist *headers=NULL;
&nbsp;headers = curl_slist_append(headers, "Content-Type: text/xml");
<p class="level0">&nbsp;curl_formadd(&post, &last,
&nbsp;             CURLFORM_COPYNAME, "logotype-image",
&nbsp;             CURLFORM_FILECONTENT, "curl.xml",
&nbsp;             CURLFORM_CONTENTHEADER, headers,
&nbsp;             CURLFORM_END);
<p class="level0">&nbsp;curl_easy_perform(easyhandle); /* post away! */
<p class="level0">&nbsp;curl_formfree(post); /* free post */
&nbsp;curl_slist_free_all(headers); /* free custom header list */
<p class="level0">All options on an easyhandle are "sticky": they remain the same until changed, even across calls to <a class="emphasis" href="./curl_easy_perform.html">curl_easy_perform(3)</a>. You may therefore need to tell curl to go back to a plain GET request if that is what you intend to do next. You force an easyhandle to go back to GET by using the CURLOPT_HTTPGET option:
<p class="level0">&nbsp;curl_easy_setopt(easyhandle, CURLOPT_HTTPGET, 1L);
<p class="level0">Just setting CURLOPT_POSTFIELDS to "" or NULL will *not* stop libcurl from doing a POST. It will just make it POST without any data to send! <p class="level0"></pre><a name="Showing"></a><h2 class="nroffsh">Showing Progress</h2><p class="level0"><p class="level0">For historical and traditional reasons, libcurl has a built-in progress meter that can be switched on to display progress in your terminal. <p class="level0">Switch on the progress meter by, oddly enough, setting CURLOPT_NOPROGRESS to FALSE (0). This option is set to TRUE (1) by default. <p class="level0">For most applications, however, the built-in progress meter is useless; what is interesting instead is the ability to specify a progress callback. 
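<p class="level0">As a rough illustration, a progress callback might look like the sketch below. It uses the five-argument signature libcurl expects for progress callbacks (an opaque pointer followed by four doubles). The <code>progress_state</code> struct, its fields and the <code>percent_of()</code> helper are hypothetical names invented for this example and are not part of libcurl.

```c
#include <stdio.h>

/* Hypothetical application state; the application would hand a pointer
   to it to libcurl with CURLOPT_PROGRESSDATA. Not a libcurl type. */
struct progress_state {
    double last_percent; /* last download percentage, -1 if unknown */
    int calls;           /* number of times the callback has run */
    int cancel;          /* set nonzero by the application to abort */
};

/* Hypothetical helper: libcurl passes 0 for unknown values, so a total
   of 0 is reported as -1 ("unknown") instead of dividing by zero. */
static double percent_of(double now, double total)
{
    return total > 0.0 ? 100.0 * now / total : -1.0;
}

/* Progress callback with the signature libcurl expects. */
static int progress_callback(void *clientp,
                             double dltotal, double dlnow,
                             double ultotal, double ulnow)
{
    struct progress_state *st = clientp;

    (void)ultotal; /* upload figures are ignored in this sketch */
    (void)ulnow;

    st->calls++;
    st->last_percent = percent_of(dlnow, dltotal);
    if (st->last_percent >= 0.0)
        fprintf(stderr, "\rdownloaded %.0f%%", st->last_percent);

    /* A nonzero return value asks libcurl to abort the transfer. */
    return st->cancel;
}
```

<p class="level0">To use it, an application would pass <code>progress_callback</code> with CURLOPT_PROGRESSFUNCTION, a pointer to a <code>progress_state</code> with CURLOPT_PROGRESSDATA, and set CURLOPT_NOPROGRESS to 0 so that the callback actually gets called.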
The function pointer you pass to libcurl will then be called at irregular intervals with information about the current transfer. <p class="level0">Set the progress callback by using CURLOPT_PROGRESSFUNCTION, passing a pointer to a function that matches this prototype: <p class="level0"><pre><p class="level0">&nbsp;int progress_callback(void *clientp,
&nbsp;                      double dltotal,
&nbsp;                      double dlnow,
&nbsp;                      double ultotal,
&nbsp;                      double ulnow);
<p class="level0">If any of the input arguments is unknown, a 0 will be passed. The first argument, 'clientp', is the pointer you pass to libcurl with CURLOPT_PROGRESSDATA. libcurl won't touch it. <p class="level0"></pre><a name="libcurl"></a><h2 class="nroffsh">libcurl with C++</h2><p class="level0"><p class="level0">There's basically only one thing to keep in mind when using C++ instead of C when interfacing with libcurl: <p class="level0">The callbacks CANNOT be non-static class member functions <p class="level0">Example C++ code: <p class="level0"><pre><p class="level0">class AClass {
&nbsp;   static size_t write_data(void *ptr, size_t size, size_t nmemb,
&nbsp;                            void *ourpointer)
&nbsp;   {
&nbsp;     /* do what you want with the data */
&nbsp;     return size * nmemb; /* tell libcurl that all bytes were handled */
&nbsp;   }
&nbsp;};
<p class="level0"></pre><a name="Proxies"></a><h2 class="nroffsh">Proxies</h2><p class="level0"><p class="level0">What "proxy" means according to Merriam-Webster: "a person authorized to act for another" but also "the agency, function, or office of a deputy who acts as a substitute for another". <p class="level0">Proxies are exceedingly common these days. Companies often only offer Internet access to employees through their HTTP proxies. Network clients or user-agents ask the proxy for documents; the proxy performs the actual request and then returns the documents. 
<p class="level0">libcurl has full support for HTTP proxies, so when a given URL is wanted, libcurl will ask the proxy for it instead of trying to connect to the actual host identified in the URL. <p class="level0">The fact that the proxy is an HTTP proxy puts certain restrictions on what can actually happen. A requested URL that is not an HTTP URL will still be passed to the HTTP proxy, which delivers the result back to libcurl. This happens transparently, and an application may not need to know. I say "may", because at times it is very important to understand that all operations over an HTTP proxy use the HTTP protocol. For example, you can't invoke your own custom FTP commands or even get proper FTP directory listings. <p class="level0"><p class="level0"><a name="Proxy"></a><span class="nroffip">Proxy Options</span> <p class="level1"><p class="level1">To tell libcurl to use a proxy at a given port number: 
