I'm having trouble generating WARC files. I tried to add a Chrome extension WARCreate, but it doesn't seem to work. Others on the internet are saying WARCreate doesn't work with newer versions of Chrome. I can only assume they are correct.
Next, I tried using Conifer (previously known as Webrecorder) to create a WARC file. The actual creation seems to go OK, but I'm having trouble understanding how to get the file out of Conifer and add it to Preservica successfully. I've tried setting up a folder on my computer to put WARC files into and then ingesting from there, but something odd has happened: the "https://" part of the URL changes to "https---".
Any clues as to what I can do differently?
Hi Joan,
We can't really comment on Conifer as it is a third-party product, but this page in their help guide appears to cover downloading the WARC file: https://guide.conifer.rhizome.org/docs/manage-sessions/exporting-warc/
Once the file is downloaded, you have a WARC file that isn't actually a WARC! It is a zip file masquerading as a WARC, so you need to:
Rename the file to give it a .zip extension
Unzip the file
Then upload into Starter
Once you go into the asset you will be prompted to add the 'seed' URL, basically the address of the page the WARC file represents, e.g. https://preservica.com
Now it should be viewable.
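If your file manager does not offer an unzip option, the same check and extraction can be sketched in Python using only the standard library. This is a rough sketch, not a Conifer or Preservica tool: the function name `unpack_download` and the file names in the usage note are made up for illustration, and it assumes the download is a zip archive, a gzip stream, or already a plain file.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def unpack_download(src, out_dir):
    """Unpack a download that may be a zip or gzip archive with a misleading extension.

    Returns the list of files that are ready to upload.
    (Hypothetical helper, written for illustration only.)
    """
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Case 1: the file is really a zip archive, whatever its extension says.
    if zipfile.is_zipfile(src):
        with zipfile.ZipFile(src) as zf:
            names = [i.filename for i in zf.infolist() if not i.is_dir()]
            zf.extractall(out_dir)
        return [out_dir / name for name in names]

    # Case 2: the file is a gzip stream (it starts with the bytes 1f 8b).
    with open(src, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        # Drop a trailing ".gz"; otherwise mark the output so it cannot clash with src.
        name = src.stem if src.suffix == ".gz" else src.stem + ".unpacked" + src.suffix
        target = out_dir / name
        with gzip.open(src, "rb") as fin, open(target, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        return [target]

    # Case 3: not compressed at all, so there is nothing to unpack.
    return [src]
```

For example, `unpack_download("my-session.warc", "unpacked")` would write the extracted files into a folder named `unpacked` and return their paths.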
Hi Steve,
Could you elaborate on how to unzip a WARC file, or point to any resources? I don't get unzipping or extracting as an option when I right-click the file, and I'm having trouble locating any tools or resources that seem able to do it. Thanks!
Steve Crawford
Ingest of WARC files
You are now able to ingest and render WARC files in Starter. A WARC file is an archive that contains information about a website that was gathered from "crawls" performed by Internet bots for archival purposes. It stores WARC records, which may include information about the HTML, CSS, images, video, and scripts used by websites. WARC files also include metadata about how and where the web information was retrieved.
Drag-and-drop your existing WARC files into Starter or add individual files to your Starter application via the add button as you would with any other file type.
Once the files are uploaded, you are reminded to record the URL of the site that was crawled on the Metadata Properties page, where you can also edit it later. When you click on the eye icon for the asset, you will be able to view the content of the associated website.
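If you are unsure which URL was crawled, the WARC records themselves usually carry it in a WARC-Target-URI header field. The sketch below is a simplified standard-library reader, not part of Starter. The function names `iter_warc_headers` and `candidate_seed_urls` are made up for illustration, and it assumes a well-formed WARC (plain or gzip-compressed) in which every record declares its Content-Length.

```python
import gzip


def iter_warc_headers(path):
    """Yield one dict of header fields (lower-cased names) per WARC record."""
    # Detect gzip by its two magic bytes rather than by file extension.
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                return  # end of file
            if not line.startswith(b"WARC/"):
                continue  # blank separator lines between records
            headers = {}
            while True:
                line = fh.readline()
                if not line.strip():
                    break  # a blank line ends the header block
                key, _, value = line.decode("utf-8", "replace").partition(":")
                headers[key.strip().lower()] = value.strip()
            # Skip the record's content block (read into memory, fine for a sketch).
            fh.read(int(headers.get("content-length", 0)))
            yield headers


def candidate_seed_urls(path, limit=5):
    """Return the first few distinct URLs found in 'response' records."""
    urls = []
    for headers in iter_warc_headers(path):
        uri = headers.get("warc-target-uri", "").strip("<>")
        is_response = headers.get("warc-type") == "response"
        if is_response and uri.startswith("http") and uri not in urls:
            urls.append(uri)
            if len(urls) >= limit:
                break
    return urls
```

The first URL returned is often, but not always, the page the crawl started from, so treat the result as a list of candidates to check rather than a definitive answer.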
Choose preference for toast messages (screen notifications)
You are now able to set your preference for toast messages. These are small messages that are displayed in a box at the bottom of the screen to provide feedback about a requested action. You can configure these messages in your user account, in the menu item Notifications.
By default, toast messages are set to remain active until you manually close them.
If you wish to turn on automatic closure, you can do so by setting the toggle switch to “on”. You will receive a warning message upon changing the default option; just click “OK, I want to continue”, and the toggle will switch to “on”.
When the toggle switch is on, all notifications will close automatically.