Commit Graph

50 Commits

Author SHA1 Message Date
Maks Snegov
c1724b5921 use base64 encoding for embedded scripts
This avoids some issues in browsers' renderers (habrahabr pages were broken
because of a nested </script> in script content).
2014-10-04 03:38:34 +04:00
Maks Snegov
6b3aa602ef add script embedding 2014-10-04 03:24:38 +04:00
Maks Snegov
cf626546e7 use set of content-types for checking 2014-07-23 08:45:12 +04:00
Maks Snegov
fbf52e9544 add script parsing 2014-07-21 00:46:30 +04:00
Maks Snegov
7ce2bfb97f fix urllib.error.HTTPError print 2014-07-20 21:42:13 +04:00
Maks Snegov
41e984e1f0 fix urllib.error.HTTPError calls 2014-07-20 21:40:14 +04:00
Maks Snegov
fb3870e9dd skip http error pages 2014-07-20 17:31:43 +04:00
Maks Snegov
09346f4a70 fix: error with css charsets if no base charset 2014-07-20 17:31:15 +04:00
Maks Snegov
61d3d84a9c remove unused exception 2014-07-20 17:30:48 +04:00
Maks Snegov
b5ddae0ef8 fix css charset error, add urllib.error.httperror 2014-07-20 17:04:56 +04:00
Maks Snegov
964e79f97b add gzip encoding support 2014-07-20 14:03:49 +04:00
Maks Snegov
5c9d04cf3d use file with links as arguments 2014-07-20 13:48:18 +04:00
Maks Snegov
514b39d287 use default charset utf-8 if not set in headers 2014-07-20 13:31:20 +04:00
Maks Snegov
45f30ca9de fix: error with urls without scheme ('//ya.ru/index.html') 2014-07-20 13:30:22 +04:00
Maks Snegov
b58188b7b7 remove import 2014-07-20 13:29:56 +04:00
Maks Snegov
c523d025af add duplicate checking 2014-07-20 13:06:51 +04:00
Maks Snegov
a0fbb414a7 write url in the beginning of the file 2014-07-20 12:17:01 +04:00
Maks Snegov
716c61f6f1 replace http.client with urllib 2014-07-20 08:09:07 +04:00
Maks Snegov
eb2c43f438 ignore UTF-8 errors 2014-06-25 08:38:43 +04:00
Maks Snegov
6a818f4bb4 fix: error with empty GET urls 2014-06-23 00:50:21 +04:00
Maks Snegov
594ff71991 add css embedding 2014-06-22 23:51:18 +04:00
Maks Snegov
754411b6b7 remove unused header from request 2014-06-22 22:57:42 +04:00
Maks Snegov
a7ef8a8b7b separate complete_url function 2014-06-22 22:56:43 +04:00
Maks Snegov
35f755005d fix: do not work with GET arguments 2014-06-22 13:12:35 +04:00
Maks Snegov
fe69eff79b fix increment postfix in filenames 2014-06-22 12:38:05 +04:00
Maks Snegov
5c87f241d1 clean title from multiple whitespaces 2014-06-22 12:24:10 +04:00
Maks Snegov
ae63ca6318 skip connRefusedError pictures 2014-06-22 12:16:10 +04:00
Maks Snegov
36be68d78d fix title with attributes parsing 2014-06-22 11:59:02 +04:00
Maks Snegov
ab03e18ce2 fix relative urls 2014-06-22 11:48:04 +04:00
Maks Snegov
5b91bef896 add infinite redirects blocking 2014-06-22 11:47:21 +04:00
Maks Snegov
11de357865 add image embedding 2014-06-22 11:45:37 +04:00
Maks Snegov
5837451ed7 add url as comment to saved pages 2014-06-21 20:23:25 +04:00
Maks Snegov
e2009e7f08 skip fname duplicates 2014-06-21 20:09:15 +04:00
Maks Snegov
ab9a7e34c1 get title name 2014-06-21 09:58:47 +04:00
Maks Snegov
aead01258d remove never used if condition 2014-06-21 09:43:12 +04:00
Maks Snegov
ae4a9b986e add gzip support 2014-06-17 22:31:02 +04:00
Maks Snegov
2666d7911a no scheme in url fix 2014-06-17 22:28:54 +04:00
Maks Snegov
5b05f3e8d0 separate download_content() from get_page() 2014-06-17 22:26:12 +04:00
Maks Snegov
2f6c877493 fix: URL with no schema will raise error 2014-06-15 20:16:35 +04:00
Maks Snegov
7e43162920 rewrite HTML title parser 2014-06-01 23:20:42 +04:00
Maks Snegov
af948ff6fc move shell script to deprecated dir 2014-06-01 21:28:06 +04:00
Maks Snegov
6cbfec5067 set result file name by page title 2013-12-24 23:00:43 +04:00
Maks Snegov
fe61491292 add redirect support 2013-11-10 00:15:30 +04:00
Maks Snegov
67b7dc81e9 fix charset from response header
There can be headers without a charset, like
Content-Type: text/html
2013-11-09 22:39:35 +04:00
Maks Snegov
5818b0e096 determine charset from response header 2013-11-09 22:01:43 +04:00
Maks Snegov
36b407e86c init nevernote, python version 2013-11-09 21:20:53 +04:00
Maks Snegov
c8fcdd6241 Fix bug: if result dir (notebook name or todo) doesn't exist,
crash.
2012-10-07 21:19:07 +04:00
Maks Snegov
c08d3da905 Fix bug: if .nevernote doesn't exist in home dir, script will stop 2012-10-07 17:31:29 +04:00
Maks Snegov
64fa37f0af Move config files to dot-dir in userhome.
Add notebook support.
Do not create sub-directories.
2012-10-07 15:26:44 +04:00
Maks Snegov
1fa1606f0d Initial commit. 2012-08-19 02:46:46 +04:00