Q: FTP links are not caught! What's happening?
A: FTP files might be seen as external links, especially if they are located in an outside domain. You either have to accept all external links (see the link options, -n option) or only specific files (see the filters section).

Example: you are downloading http://www.someweb.com/foo/ and cannot get files from ftp://ftp.someweb.com. Then add the filter rule +ftp.someweb.com/* to accept all files from this (FTP) location. A full command line is sketched at the end of this section.

Q: I got some weird messages telling me that robots.txt does not allow several files to be captured. What's going on?

A: These rules, stored in a file called robots.txt, are given by the website to specify which links or folders should not be caught by robots and spiders – for example, /cgi-bin or large image files. They are followed by default by HTTrack, as is advised. Therefore, you may miss some files that would have been downloaded without these rules – check your logs to see if this is the case:

Info: Note: due to www.foobar.com remote robots.txt rules, links beginning with these paths
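If you really need the files blocked this way, HTTrack's robots handling can be relaxed. The sketch below assumes the sN option controls this behaviour (s0 = never follow robots.txt rules, s2 = always, the usual default) and that /mirror/foobar is just an illustrative output directory; check httrack --help on your version, and remember that the site published these rules for a reason:

    httrack "http://www.foobar.com/" -O /mirror/foobar -s0

With s0, the links previously listed as forbidden in the log should be crawled like any others.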
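Returning to the FTP question above: filters can be appended directly to the command line. The following is only a sketch, where /mirror/someweb is an illustrative output directory and the URLs are the placeholders from the example:

    httrack "http://www.someweb.com/foo/" -O /mirror/someweb "+ftp.someweb.com/*"

This mirrors the HTTP site as before, and the +ftp.someweb.com/* filter additionally accepts every file from the FTP host instead of treating it as an external link.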