Product Page Processing Methods
From Salish Sea Wiki
- For all pages containing {{Product}}
- If namespace is not "File" or "Main" then flag as "Product in other namespace" and END
- No other namespaces contain {{Product}} or {{product}}.
- There are 467 pages in the File namespace containing either {{Product}} or {{product}} (24 & 443 respectively), see list here.
- There are 190 pages in the Main namespace containing {{Product}} or {{product}} (29 & 161 respectively), see list here.
- Where free text contains [[Category:Document]] Then:
- Look for a four-digit number in the name string and make YEAR = that four-digit number; if there is no number then flag as "Document with no year" (a few...)
- There are 87 pages that don't include a 4 digit year, see list here (note, this is just the count of all {{Product}} or {{product}}, not necessarily those pages that also contain [[Category:Document]]). This is expected, and some of these will need to shift to a new template, so some mechanism for replacing Product template with Cosmetic Graphic template will be useful.
- This sounds fine, if it's supposed to be moved from the Product Template to the Cosmetic Graphic Template, we'd just need a mapping. Let me know if this should be done before the data scraping (can change template using replace text) or if it can be integrated into the spreadsheet review (preference) - PLEASE INSTRUCT
- Take text to left of the number and make AUTHOR = left text string (stripped of the rightmost space)
- Assuming when there's no 4 digit year, this needs to be skipped as well.
- Take text to right of the number, starting at first non-space character and make TITLE = right text string; if no text then
- If there is no text, then what should be done? All of these should be redirect pages... which don't have the template, otherwise I will need to change page name manually.
- With the example of the page David et al 2014, it contains {{product}}, contains [[Category:Document]], has a 4 digit YEAR, has text to the left for AUTHOR, but does not have text to the right. If this is supposed to be a redirect, what does it redirect to? I "think" this may be a manual action, but the page probably should be moved so that it looks like "David et al 2014 Foraging and Growth Potential of Juvenile Chinook Salmon after Tidal Restoration". Then the original page (David et al 2014) would redirect to there? This is an example of a product that was created without the QA of the form... so it just needs a page move. I can do that manually (and just did for David et al), or I can perform the page move as part of the quality control review of page data. PLEASE INSTRUCT
- Take all categories and add as comma delimited structured data in the field CATEGORIES
- This can be done to add the page to the correct category, but since the tree selector is split into 6 parts, we need to add each category into the correct sub-tree (Geographic Place, Political Jurisdiction, Workgroup Origin, Anthropogenic Topic, Ecosystem Topic, Purpose). I presume this requires a lookup table... where you lookup the category, and see the sub-tree assignment? This raises the question: where is the place of truth for the organization of categories?
- Agreed, a lookup table would be needed to place all the categories in the correct fields. The truth for the organization of these categories is currently the category hierarchy used by Form:Product and defined by the Category pages Salish Sea, Jurisdiction, Workgroup, Anthropogenic Topics, Ecosystem Topics, & Effort. Categories have stabilized for the moment. SHOULD I DEVELOP THIS LOOKUP TABLE?
- Where namespace = File Then
- Create a new page with page name AUTHOR + YEAR + TITLE
- Create a link directly to the File Media at the top of the new page
- This should just be adding the File Media (pagename) to the structured data (i.e. the "Link To File" section of the Form or the "File" field), then the Product Template would handle laying out that fields value on the page, in this case the top of the page. Understood.
- Add all the free text and structured data created above to the new page
- Delete the original File:Page content and replace with #redirect[[NEWPAGENAME]] //or alternately a link to the main namespace product page?
- I'm not sure what this part means, "or alternately a link to the main namespace product page?" I wasn't sure what "best practice" might be... if there is any unexpected problem with using a redirect? Redirect is my impulse so that any old links to the file page go to the new main page?
- Redirects make sense, I just didn't understand the alternative. Maybe you just meant a link on the page vs. a redirect? If so, I think the redirect makes more sense. Agreed.
- Where free text contains [[Category:Dataset]] Then:
- Are we sure there is no overlap? i.e. pages with both [[Category:Document]] & [[Category:Dataset]].
- Not sure... but should not be... use order of operations to manage? preference Document sub-typing.
- Makes sense. If you order things (maybe like they are here) then the order taken should flush things out unless they get re-run.
- Look for four digit number in name string, and make YEAR = Number, if there is no number then flag as "Document with no year" (a few.. )
- Take text to left of the number and make AUTHOR = left text string (stripped of the rightmost space)
- Same note as above section.
- Take text to right of the number, starting at first non-space character and make TITLE = right text string; if no text then
- If there is no text, then what should be done? same as above... requires manual name change.
- Take all categories and add as comma delimited structured data in the field CATEGORIES
- Same note as above section.
- Where free text contains [[Category:Graphic]] or [[Category:Image]] or [[Category:Map]] or [[Category:Diagram]]
- Are we sure there is no overlap? i.e. pages with both any of the above and [[Category:Document]] & [[Category:Dataset]].
- Replace {{Product}} with {{Picture}} //Right now, all the media in the graphic category are just images used to decorate pages. At a later date, they can be "promoted" to full Products if it is merited. We will need to start flagging and managing a set of File namespace media that are in the Cosmetic Image category (as opposed to a graphic that is a product with an author and year).
- If contains [[Category:Webpage]]
- AUTHOR = Null, YEAR = Null
- TITLE = pagename
- Take all categories and add as comma delimited structured data in the field CATEGORIES
- Same note as above section.
- For any remaining product pages flag as "Product without subtype"
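The steps above could be sketched roughly as follows. This is only an illustration of the order of operations, not the actual conversion script. The function names (`has_category`, `parse_name`, `process`), the result dictionary, and the "Needs manual rename" flag are all assumptions made for the example. It also assumes that the first four-digit number in the name is the year, and that the caller has already removed the namespace prefix from the page name.

```python
import re

# A run of exactly four digits (not part of a longer number).
YEAR_RE = re.compile(r"(?<!\d)(\d{4})(?!\d)")

# Order matters: a page in both Document and Dataset is treated as a
# Document, following the "order of operations" preference noted above.
SUBTYPES = [
    ("Document", ["Document"]),
    ("Dataset", ["Dataset"]),
    ("Graphic", ["Graphic", "Image", "Map", "Diagram"]),
    ("Webpage", ["Webpage"]),
]


def has_category(text, name):
    """True if the free text contains [[Category:name]]."""
    pattern = r"\[\[\s*Category\s*:\s*" + re.escape(name) + r"\s*(\|[^\]]*)?\]\]"
    return re.search(pattern, text) is not None


def parse_name(pagename):
    """Split 'AUTHOR YEAR TITLE' on the first four-digit number.

    Returns (author, year, title); all three are None if no year is found.
    """
    match = YEAR_RE.search(pagename)
    if match is None:
        return None, None, None
    author = pagename[:match.start()].rstrip()
    title = pagename[match.end():].lstrip()
    return author, match.group(1), title


def process(namespace, pagename, text):
    """Return the subtype, parsed fields and flag for one {{Product}} page."""
    result = {"subtype": None, "flag": None}
    if namespace not in ("File", "Main"):
        result["flag"] = "Product in other namespace"
        return result
    for subtype, categories in SUBTYPES:
        if any(has_category(text, c) for c in categories):
            result["subtype"] = subtype
            break
    else:
        result["flag"] = "Product without subtype"
        return result
    if subtype in ("Document", "Dataset"):
        author, year, title = parse_name(pagename)
        if year is None:
            result["flag"] = "Document with no year"
        elif not title:
            # Hypothetical flag: the notes only say these need a manual move.
            result["flag"] = "Needs manual rename"
        result.update(author=author, year=year, title=title)
    elif subtype == "Webpage":
        result.update(author=None, year=None, title=pagename)
    return result
```

For example, `process("Main", "David et al 2014", "{{product}}\n[[Category:Document]]")` yields AUTHOR "David et al", YEAR "2014", an empty TITLE, and the manual-rename flag. Graphic pages are only classified here; the template swap is a separate text replacement.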
//This is to clean up a number of pages in File that contain media that are categorized irregularly, and also assigns them to the template {{Cosmetic Graphic}} so they can be managed more easily later. I will need to do some manual work to sort these... some may become products, and otherwise I'll have to start managing cosmetic graphics separate from products.
- Where page does NOT contain {{Product}} but does contain [[Category:Graphic]] or [[Category:Image]] or [[Category:Map]] or [[Category:Diagram]], add {{Cosmetic Graphic}} at the top of free text. //This is to support the future cleanup described above.
- File:Acquisition.jpg
- File:Allen quilceda basins.jpg
- File:Butler11x17.pdf
- File:ButlerCrestline11x17.pdf
- File:Chehalis subbasins.jpg
- File:Counties.jpg
- Category:Dataset
- File:Decker-aerial.jpg
- File:Decker-pubowner.jpg
- File:Decker-soils.jpg
- Category:Document
- File:ECOSITE 160k.jpg
- File:Ecosystem Site Discovery Bay.png
- File:Ecosystem Site Port Gamble.png
- File:Ecosystem site lower skykomish floodplain.png
- File:Ecosystem site port susan bay.png
- File:Ecosystem site snow-salmon.png
- File:Ecosystem sites north jefferson.png
- File:Ecosystem sites north thurston.png
- File:Ecosystem sites port susan.png
- File:Estuary salmon recovery.png
- File:Estuary sites.jpg
- File:Floodplains.png
- File:French creek ditches.jpg
- File:French creek watershed.jpg
- File:French creek watershed2.jpg
- Category:Graphic
- File:GreenCove Population.jpg
- File:Green cove mud minnow.png
- File:Greencovecreek.jpg
- File:Greencovecreek.pdf
- File:Headwaters of the salish sea.jpg
- File:HendersonEcoregion.jpg
- File:Henderson Inlet Watershed.pdf
- File:Historical edmonds marsh.jpg
- File:Kitsap sediment assessment.jpg
- File:Lake Stevens.png
- File:Lopez.pdf
- File:Lopez waterflow.pdf
- File:LowStilly estuary elevation.pdf
- File:LowStilly estuary parcel.pdf
- File:LowStilly lower elevation.pdf
- File:LowStilly lower parcel.pdf
- File:LowStilly sylvana elevation.pdf
- File:LowStilly sylvana parcel.pdf
- File:LowStilly upper elevation.pdf
- File:LowStilly upper parcel.pdf
- File:Lower Stillaguamish.pdf
- File:Lower Stillaguamish Agriculture.pdf
- File:Lower Stillaguamish DEM.pdf
- File:Lower stillaguamish floodplain.png
- File:Lower stillaguamish floodplain parcels.pdf
- File:Mission creek watershed.png
- File:Nisqually delta current.jpg
- File:Olympia shoreline change.jpg
- File:Pierce levee setbacks.jpg
- File:Port Townsend ecosystem site.png
- File:Scatter creek.jpg
- File:Schneider creek GLO.jpg
- File:Schneider creek aerial.jpg
- File:Schneider creek overview.jpg
- File:Skagit-stillaguamish delta site.png
- File:Skokomish.png
- File:SnoSky Confluence DEM.jpg
- File:Snohomish Schematic.png
- File:Snohomish delta after.jpg
- File:Snohomish riparian zone management pilot.png
- File:Stillaguamish Delta LiDAR 2013b.jpg
- File:The Southernmost Salish Sea.pdf
- File:Thomas creek.jpg
- Category:Website
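The text replacements named in the processing steps (the redirect left on the old File page, the {{Product}} to {{Picture}} swap, and adding {{Cosmetic Graphic}} to graphic pages that lack {{Product}}) might look something like the sketch below. The function names are made up for illustration, and the regular expressions assume the templates and categories are written in the plain forms shown on this page. A real run would more likely use replace text or a bot framework.

```python
import re

# Opening of {{Product}} or {{product}}, with or without parameters.
PRODUCT_RE = re.compile(r"\{\{\s*[Pp]roduct\s*(?=[|}])")

# Any of the graphic-type categories listed in the rule.
GRAPHIC_CATEGORY_RE = re.compile(
    r"\[\[\s*Category\s*:\s*(Graphic|Image|Map|Diagram)\s*(\|[^\]]*)?\]\]"
)


def redirect_text(new_page_name):
    """Wikitext that replaces the original File page content."""
    return "#redirect[[" + new_page_name + "]]"


def swap_product_for_picture(text):
    """Rename the Product template call to Picture, keeping any parameters."""
    return PRODUCT_RE.sub("{{Picture", text)


def add_cosmetic_graphic(text):
    """Prepend {{Cosmetic Graphic}} to graphic pages that lack {{Product}}."""
    if PRODUCT_RE.search(text):
        return text  # product pages are handled by the main procedure
    if GRAPHIC_CATEGORY_RE.search(text) is None:
        return text  # not in a graphic-type category
    if text.startswith("{{Cosmetic Graphic}}"):
        return text  # already tagged, so re-running changes nothing
    return "{{Cosmetic Graphic}}\n" + text
```

The last check in `add_cosmetic_graphic` is there so the step can be re-run without stacking templates.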
Notes
The Form and Structured data for Products have the following fields, where "(*)" marks a field as mandatory. For all mandatory fields, we likely need to have a value for every page we are converting:
- Type Group (*)
- Authors Dataset (*)
- Year Dataset (*)
- Title Dataset (*)
- FileOrCitation Dataset (*)
- Type Document (*)
- Authors Document (*)
- Year Document (*)
- Title Document (*)
- FileOrCitation Document (*)
- Type Graphic (*)
- Authors Graphic (*)
- Year Graphic (*)
- Title Graphic (*)
- FileOrCitation Graphic (*)
- Type Website (*)
- Title Website (*)
- File
- Distribution (*)
- Places
- Jurisdictions
- Workgroups
- AnthroTopic
- EcoTopic
- Effort
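As a rough sketch of how the field list above could be used during review, the code below checks a converted record for missing mandatory fields and sorts a page's categories into the six tree fields by way of a lookup table. Grouping the mandatory fields by product type is an assumption based on the field name suffixes. The lookup table itself does not exist yet, so its contents here are placeholders, and the function names are hypothetical.

```python
# Mandatory fields that appear to apply to every product (assumption).
COMMON_MANDATORY = ["Type Group", "Distribution"]

# Mandatory fields grouped by the suffix in the field list above (assumption).
MANDATORY_BY_TYPE = {
    "Dataset": ["Authors Dataset", "Year Dataset", "Title Dataset",
                "FileOrCitation Dataset"],
    "Document": ["Type Document", "Authors Document", "Year Document",
                 "Title Document", "FileOrCitation Document"],
    "Graphic": ["Type Graphic", "Authors Graphic", "Year Graphic",
                "Title Graphic", "FileOrCitation Graphic"],
    "Website": ["Type Website", "Title Website"],
}

# The six category fields, one per sub-tree of the tree selector.
TREE_FIELDS = ["Places", "Jurisdictions", "Workgroups",
               "AnthroTopic", "EcoTopic", "Effort"]


def missing_mandatory(product_type, record):
    """List the mandatory fields that are absent or empty in a record."""
    required = COMMON_MANDATORY + MANDATORY_BY_TYPE[product_type]
    return [field for field in required if not record.get(field)]


def split_categories(categories, lookup):
    """Sort categories into the six tree fields using a lookup table.

    `lookup` maps a category name to one of TREE_FIELDS. Categories that
    are not in the table are returned separately for manual review.
    """
    fields = {name: [] for name in TREE_FIELDS}
    unknown = []
    for category in categories:
        field = lookup.get(category)
        if field in fields:
            fields[field].append(category)
        else:
            unknown.append(category)
    return fields, unknown


# Placeholder table: real entries would come from the category hierarchy.
example_lookup = {"Example Place": "Places", "Example Topic": "EcoTopic"}
fields, unknown = split_categories(
    ["Example Place", "Example Topic", "Document"], example_lookup
)
```

Returning the unmatched categories separately keeps subtype categories such as Document out of the tree fields and gives a list to check against the lookup table.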