<html><body><div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt"><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;">Dear Teodoro,</div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;">Good evening.</div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;"><br></div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;">See this link: <a href="http://r-br.2285057.n4.nabble.com/R-br-r-baixando-dados-inmet-td4660459.html" style="font-size: 12pt;">http://r-br.2285057.n4.nabble.com/R-br-r-baixando-dados-inmet-td4660459.html</a></div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style:
normal;"><br></div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;">There are other ways, but you need to know Python. Here is another example.</div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><br></div><div style="background-color: transparent;">from ghost import Ghost</div><div style="background-color: transparent;">from bs4 import BeautifulSoup</div><div style="background-color: transparent;">import regex</div><div style="background-color: transparent;">import time</div><div style="background-color: transparent;">ghost
= Ghost()</div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;">def load_account():</div><div style="background-color: transparent;"> ghost.open("http://www.inmet.gov.br/projetos/rede/pesquisa/inicio.php")</div><div style="background-color: transparent;"> ghost.fill("form", {"mCod":"alissonluc@yahoo.com.br", "mSenha":"bv1k0wgj"})</div><div style="background-color: transparent;"> ghost.set_field_value("input.botao", " Acessar ")</div><div style="background-color: transparent;"> ghost.click("input.botao", expect_loading=True)</div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;">load_account()</div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;">ghost.open("http://www.inmet.gov.br/projetos/rede/pesquisa/form_mapas_c_horario.php")</div><div
style="background-color: transparent;"><br></div><div style="background-color: transparent;">ghost.fill("form", {"mRelDtInicio":"01/07/2012", </div><div style="background-color: transparent;"> "mRelDtFim":"01/08/2012",</div><div style="background-color: transparent;"> "mRelEstado":"MG",</div><div style="background-color: transparent;"> "mRelRegiao":"4",</div><div style="background-color: transparent;"> "mOpcaoAtrib1":"0",</div><div style="background-color: transparent;"> "mOpcaoAtrib2":"0",</div><div style="background-color: transparent;">
"mOpcaoAtrib5":"0",</div><div style="background-color: transparent;"> "mOpcaoAtrib6":"0",</div><div style="background-color: transparent;"> "mOpcaoAtrib8":"0",</div><div style="background-color: transparent;"> "mOpcaoAtrib9":"0",</div><div style="background-color: transparent;"> "mOpcaoAtrib12":"0"})</div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;">ghost.evaluate("document.frmCad.submit()", expect_loading=True)</div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;">ghost.capture_to("/Users/Alisson/Desktop/lixo.png")</div><div style="background-color: transparent;"><br></div><div
style="background-color: transparent;">soup = BeautifulSoup(ghost.content)</div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;"># Collect the per-station download links from the results page</div><div style="background-color: transparent;">urls = regex.findall(r"http://www.inmet.gov.br/projetos/rede/pesquisa/gera_serie_txt.php?[^ ]*", ghost.content)</div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;">tables = {}</div><div style="background-color: transparent;">errors = []</div><div style="background-color: transparent;">for url in urls:</div><div style="background-color: transparent;">    print(url)</div><div style="background-color: transparent;">    try:</div><div style="background-color: transparent;">        ghost.open(url)</div><div style="background-color: transparent;">        soup = BeautifulSoup(ghost.content)</div><div style="background-color: transparent;">    except Exception:</div><div
style="background-color: transparent;">        errors.append([url])</div><div style="background-color: transparent;">        continue  # skip stations whose page failed to load</div><div style="background-color: transparent;">    try:</div><div style="background-color: transparent;">        cidade = regex.findall(r"Esta.*?o\s*?:\s([A-Z|\s]*-\s[A-Z]*)", soup.pre.get_text())[0]</div><div style="background-color: transparent;">        table = regex.findall(r"(Estacao;Data;Hora((.|\n)*))", soup.pre.get_text())[0][0]</div><div style="background-color: transparent;">        tables[cidade] = table</div><div style="background-color: transparent;">        time.sleep(2)  # pause between requests</div><div style="background-color: transparent;">    except IndexError:</div><div style="background-color: transparent;">        errors.append([soup.pre])</div><div
style="background-color: transparent;">        continue</div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;"><br></div><div style="background-color: transparent;"># Write one text file per station</div><div style="background-color: transparent;">for cidade, tabela in tables.items():</div><div style="background-color: transparent;">    with open("/Users/Alisson/Desktop/" + cidade + ".txt", "w") as f:</div><div style="background-color: transparent;">        f.write(tabela)</div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;"><br></div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;">Best regards</div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;"></div><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;"> </div><div style="font-family:
'times new roman', 'new york', times, serif; font-size: 12pt;"><font face="times new roman, new york, times, serif">Alisson Lucrécio da Costa</font><br></div> <div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;"> <div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;"> <div dir="ltr"> <hr size="1"> <font size="2" face="Arial"> <b><span style="font-weight:bold;">From:</span></b> Teodoro Calvo &lt;teocalvo2@gmail.com&gt;<br> <b><span style="font-weight: bold;">To:</span></b> r-br@listas.c3sl.ufpr.br <br> <b><span style="font-weight: bold;">Sent:</span></b> Wednesday, October 2, 2013 8:43 PM<br> <b><span style="font-weight: bold;">Subject:</span></b> [R-br] Using R to import information from the web<br> </font> </div> <div class="y_msg_container"><br>Hello, good evening.<br clear="none"><br clear="none">How can I extract part of the text of a given website using <br
clear="none">R?<br clear="none">Is it possible? Is there any material on this?<br clear="none"><br clear="none">Thanks in advance.<br clear="none"><br clear="none">Regards, Téo Calvo.<div class="yqt5599790095" id="yqtfd31308"><br clear="none">_______________________________________________<br clear="none">R-br mailing list<br clear="none"><a shape="rect" ymailto="mailto:R-br@listas.c3sl.ufpr.br" href="mailto:R-br@listas.c3sl.ufpr.br">R-br@listas.c3sl.ufpr.br</a><br clear="none"><a shape="rect" href="https://listas.inf.ufpr.br/cgi-bin/mailman/listinfo/r-br" target="_blank">https://listas.inf.ufpr.br/cgi-bin/mailman/listinfo/r-br</a><br clear="none">Read the posting guide (<a shape="rect" href="http://www.leg.ufpr.br/r-br-guia" target="_blank">http://www.leg.ufpr.br/r-br-guia</a>) and provide minimal reproducible code.</div><br><br></div> </div> </div> </div></body></html>