Commit Graph

34 Commits

Author SHA1 Message Date
Maks Snegov
5c9d04cf3d use file with links as arguments 2014-07-20 13:48:18 +04:00
Maks Snegov
514b39d287 use default charset utf-8 if not set in headers 2014-07-20 13:31:20 +04:00
Maks Snegov
45f30ca9de fix: error with urls without scheme ('//ya.ru/index.html') 2014-07-20 13:30:22 +04:00
Maks Snegov
b58188b7b7 remove import 2014-07-20 13:29:56 +04:00
Maks Snegov
c523d025af add duplicate checking 2014-07-20 13:06:51 +04:00
Maks Snegov
a0fbb414a7 write url in the beginning of the file 2014-07-20 12:17:01 +04:00
Maks Snegov
716c61f6f1 replace http.client with urllib 2014-07-20 08:09:07 +04:00
Maks Snegov
eb2c43f438 ignore UTF-8 errors 2014-06-25 08:38:43 +04:00
Maks Snegov
6a818f4bb4 fix: error with empty GET urls 2014-06-23 00:50:21 +04:00
Maks Snegov
594ff71991 add css embedding 2014-06-22 23:51:18 +04:00
Maks Snegov
754411b6b7 remove unused header from request 2014-06-22 22:57:42 +04:00
Maks Snegov
a7ef8a8b7b separate complete_url function 2014-06-22 22:56:43 +04:00
Maks Snegov
35f755005d fix: do not work with GET arguments 2014-06-22 13:12:35 +04:00
Maks Snegov
fe69eff79b fix increment postfix in filenames 2014-06-22 12:38:05 +04:00
Maks Snegov
5c87f241d1 clean title from multiple whitespaces 2014-06-22 12:24:10 +04:00
Maks Snegov
ae63ca6318 skip connRefusedError pictures 2014-06-22 12:16:10 +04:00
Maks Snegov
36be68d78d fix title with attributes parsing 2014-06-22 11:59:02 +04:00
Maks Snegov
ab03e18ce2 fix relative urls 2014-06-22 11:48:04 +04:00
Maks Snegov
5b91bef896 add infinite redirects blocking 2014-06-22 11:47:21 +04:00
Maks Snegov
11de357865 add image embedding 2014-06-22 11:45:37 +04:00
Maks Snegov
5837451ed7 add url as comment to saved pages 2014-06-21 20:23:25 +04:00
Maks Snegov
e2009e7f08 skip fname duplicates 2014-06-21 20:09:15 +04:00
Maks Snegov
ab9a7e34c1 get title name 2014-06-21 09:58:47 +04:00
Maks Snegov
aead01258d remove never used if condition 2014-06-21 09:43:12 +04:00
Maks Snegov
ae4a9b986e add gzip support 2014-06-17 22:31:02 +04:00
Maks Snegov
2666d7911a no scheme in url fix 2014-06-17 22:28:54 +04:00
Maks Snegov
5b05f3e8d0 separate download_content() from get_page() 2014-06-17 22:26:12 +04:00
Maks Snegov
2f6c877493 fix: URL with no schema will raise error 2014-06-15 20:16:35 +04:00
Maks Snegov
7e43162920 rewrite HTML title parser 2014-06-01 23:20:42 +04:00
Maks Snegov
6cbfec5067 set result file name by page title 2013-12-24 23:00:43 +04:00
Maks Snegov
fe61491292 add redirect support 2013-11-10 00:15:30 +04:00
Maks Snegov
67b7dc81e9 fix charset from response header
there can be headers without a charset, like
Content-Type: text/html
2013-11-09 22:39:35 +04:00
Maks Snegov
5818b0e096 determine charset from response header 2013-11-09 22:01:43 +04:00
Maks Snegov
36b407e86c init nevernote, python version 2013-11-09 21:20:53 +04:00