Part of lp.services.apachelogparser
Function | get_files_to_parse | Return an iterator of file and position where reading should start. |
Function | get_fd_and_file_size | Return a file descriptor and the file size for the given file path. |
Function | parse_file | Parse the given file starting on the given position. |
Function | create_or_update_parsedlog_entry | Create or update the ParsedApacheLog with the given first_line. |
Function | get_day | Extract the day from the given date and return it as a datetime. |
Function | get_host_date_status_and_request | Extract the host, date, status and request from the given line. |
Function | get_method_and_path | Extract the method of the request and path of the requested file. |
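As a rough illustration of what the last three helpers in the table describe, here is a minimal sketch that splits one line in Apache's "combined" log format, reduces its date to a day, and separates the request method from the path. The names `LOG_LINE`, `split_log_line`, `day_of` and `method_and_path` are hypothetical and are not the module's own functions; the regular expression is an assumption about the log format and does not handle escaped quotes or paths containing spaces.

```python
import re
from datetime import datetime

# Hypothetical pattern for an Apache "combined" log line:
# host ident user [date] "request" status size ...
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
)


def split_log_line(line):
    """Return (host, date, status, request) as strings."""
    match = LOG_LINE.match(line)
    if match is None:
        raise ValueError("Unparseable log line: %r" % line)
    return (
        match.group("host"),
        match.group("date"),
        match.group("status"),
        match.group("request"),
    )


def day_of(date):
    """Turn '13/Jun/2008:18:38:57 +0100' into a datetime at midnight."""
    day_part = date.split(":", 1)[0]
    # %b assumes English month abbreviations (the default C locale).
    return datetime.strptime(day_part, "%d/%b/%Y")


def method_and_path(request):
    """Split 'GET /some/file HTTP/1.1' into ('GET', '/some/file')."""
    method, path = request.split()[:2]
    return method, path
```

The day is kept as a `datetime` at midnight rather than a `date`, mirroring the table's note that the day is returned "as a datetime".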
The lines read from that position onwards will be the ones that have not been parsed yet.
Parameters | file_paths | The paths to the files. |
The file descriptor is opened in the default mode ('r') and positioned at the beginning of the file.
If the given file_path points to a gzipped file, the file size returned is that of the uncompressed file.
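One way to obtain such a descriptor and size is sketched below. The function name `open_with_uncompressed_size` is hypothetical, and detecting gzipped files by a `.gz` suffix is an assumption; the real function may decide differently. The sketch measures the uncompressed size by reading through the gzip stream once and then rewinding, which costs a full pass over the data.

```python
import gzip
import os

CHUNK_SIZE = 64 * 1024


def open_with_uncompressed_size(file_path):
    """Return (fd, size), with fd positioned at the start of the data."""
    if file_path.endswith(".gz"):  # assumption: suffix marks gzipped logs
        fd = gzip.open(file_path)  # default mode; binary for gzip files
        size = 0
        # Decompress the whole stream once to count its bytes.
        while True:
            chunk = fd.read(CHUNK_SIZE)
            if not chunk:
                break
            size += len(chunk)
        fd.seek(0)  # rewind so callers start reading from the beginning
    else:
        fd = open(file_path)  # default mode ('r')
        size = os.path.getsize(file_path)
    return fd, size
```

Note that in Python 3 the default mode yields bytes from `gzip.open` but text from the built-in `open`, so a caller of this sketch would need to handle both.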
parsed_lines accepts the number of lines that have been parsed during previous calls to this function so they can be taken into account against max_parsed_lines. The total number of parsed lines is then returned so it can be passed back to future calls to this function.
Return a dictionary mapping file_ids (from the librarian) to days to countries to number of downloads.
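The running line count and the nested result can be pictured with the following sketch. It is not the module's parse_file: the function `count_downloads` is hypothetical, and it assumes each log line has already been reduced to a `(file_id, day, country)` tuple. It only shows how a `parsed_lines` total can be threaded through successive calls against a shared `max_parsed_lines` limit while downloads are tallied per file, day and country.

```python
from datetime import datetime


def count_downloads(records, parsed_lines=0, max_parsed_lines=None):
    """Tally downloads and return (downloads, total_parsed_lines).

    downloads maps file_id -> day -> country -> number of downloads.
    """
    downloads = {}
    for file_id, day, country in records:
        # Stop once the budget shared across calls is used up.
        if max_parsed_lines is not None and parsed_lines >= max_parsed_lines:
            break
        parsed_lines += 1
        per_day = downloads.setdefault(file_id, {})
        per_country = per_day.setdefault(day, {})
        per_country[country] = per_country.get(country, 0) + 1
    return downloads, parsed_lines


# Two hypothetical log files parsed under one limit of three lines.
day = datetime(2008, 6, 13)
first_file = [(15018215, day, "US"), (15018215, day, "US")]
second_file = [(15018215, day, "BR"), (15018974, day, "AU")]

first_result, total = count_downloads(first_file, max_parsed_lines=3)
second_result, total = count_downloads(
    second_file, parsed_lines=total, max_parsed_lines=3)
```

Passing the returned total into the next call is what lets the limit apply to the whole run rather than to each file separately; here the second call stops after one line.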