TEv2 - including images (perhaps also other kinds of files)

2021-05-11T06:37:23Z

When editing our documentation, I found that having `

TEv2 ways of referencing documents

2020-10-13T13:16:32Z

# TEv1 way of referencing

In TEv1, documents are referable by the docusaurus mechanism for referencing, i.e. by using the docusaurus `id` attribute, optionally prepended by a (relative) path to the directory in which the referred-to document lives. For example, a document may refer to a document with attribute `id: some-id` by writing `%%shown text|some-id%%`, or `%%shown text|./some-id%%`. This mechanism works for terminologies under the assumption that there's only one terminology/scope to deal with.

# TEv2 ways of referencing

In TEv2 we need a referencing mechanism that allows us to deal with multiple scopes (namespaces). Now follows a suggestion for allowed syntax. In this suggestion, when we say 'in the same scope' or 'of the same type', this means 'same as: the document that contains the reference'. The idea of the syntax is that after selecting all documents that satisfy the specified selection criterion, only one document remains. If this is not the case, an error condition exists that should be 'thrown' to the author with a message that allows him to determine what went wrong AND how to fix it.

- [ ] `%%shown-text|<typeid>%%` selects all documents with attribute `typeid: <typeid>` that are in the same scope.
- [ ] `%%shown-text|<type>-<typeid>%%` selects all documents with attributes `type: <type>` AND `typeid: <typeid>` that are in the same scope.
- [ ] `%%shown-text|<scopeid>/<typeid>%%` selects all documents with attributes `scopeid: <scopeid>` AND `typeid: <typeid>`.
- [ ] `%%shown-text|<scopeid>/<type>-<typeid>%%` selects all documents with attributes `scopeid: <scopeid>` AND `type: <type>` AND `typeid: <typeid>`.

Notes:

- this syntax can be extended in future versions by replacing the `<scopeid>` in references by a 'path' of (nested) scopes, and perhaps even by a URI+'path-of-nested-scopes'.
- a scopeid is NOT a directory(name) per se. It SHOULD be possible to have a mapping between a (nest of) scopes and arbitrary directories, e.g. by means of some JSON object such as

~~~json
{
  "scope-register": [
    {
      "scopeID": "essifLab",
      "scopename": "eSSIF-Lab",
      "location": "https://gitlab.grnet.gr/essif-lab/framework/terminology",
      "subscopes": [
        {
          "scopeID": "essifLabTerminology",
          "scopename": "eSSIF-Lab Terminology",
          "location": "https://gitlab.grnet.gr/essif-lab/framework/terminology/terminology"
        }
      ]
    }
  ]
}
~~~

Such a JSON object may be generated from scope-files (.md files that contain scope-data), or by having the admin of the system maintain that JSON object.

# TEv2 shorthand references

When writing this text, more than 71% (216/301) of the references in the eSSIF-Lab documentation were of the form `%%shown-text|reference-text%%` where `shown-text`, after converting it to all lower-case, would equal `reference-text`. This percentage rises to over 83% (252/301) if spaces in `shown-text` were also converted to `-` characters.

This makes the case for specifying that `%%shown-text%%`, provided it only consists of characters {[A-Z], [a-z], '-', ' '}, must be rewritten to `%%shown-text|reference-text%%` prior to further processing, where `reference-text` is the all-lowercase version of `shown-text` in which every space is replaced by a `-` character. If `shown-text` contains characters other than those allowed, an error condition exists (which must be thrown to the author etc.).
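The selection and shorthand rules above could be sketched as follows. This is only an illustration under stated assumptions, not the TEv2 implementation: the names `Doc`, `KNOWN_TYPES`, `expand_shorthand` and `resolve` are hypothetical, and because a `typeid` may itself contain a `-`, the sketch assumes that the part before the first `-` counts as a `type` only when it is one of the known type names.

```python
import re
from dataclasses import dataclass

# Assumption: the set of type names is known up front (taken from the
# document-structure issue); anything else before a '-' is part of the typeid.
KNOWN_TYPES = {"scope", "concept", "term", "glossary", "dictionary"}
SHORTHAND = re.compile(r"[A-Za-z\- ]+")


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a document's TEv2 header attributes."""
    scopeid: str
    type: str
    typeid: str


def expand_shorthand(ref: str) -> str:
    """Rewrite `%%shown text%%` to `%%shown text|shown-text%%`."""
    inner = ref[2:-2]
    if "|" in inner:
        return ref  # already in the long form
    if not SHORTHAND.fullmatch(inner):
        raise ValueError(
            f"shorthand {ref!r} may only contain letters, '-' and spaces; "
            "write it as %%shown-text|reference-text%% instead"
        )
    return f"%%{inner}|{inner.lower().replace(' ', '-')}%%"


def resolve(ref: str, current: Doc, corpus: list) -> Doc:
    """Return the single document selected by `ref`, or raise an error."""
    _shown, target = expand_shorthand(ref)[2:-2].split("|", 1)
    head, slash, rest = target.partition("/")
    # Without '<scopeid>/', the scope of the referring document is used.
    scopeid, rest = (head, rest) if slash else (current.scopeid, target)
    prefix, dash, tail = rest.partition("-")
    if dash and prefix in KNOWN_TYPES:
        doc_type, typeid = prefix, tail
    else:
        doc_type, typeid = None, rest  # no type given: any type matches
    matches = [
        d for d in corpus
        if d.scopeid == scopeid and d.typeid == typeid
        and (doc_type is None or d.type == doc_type)
    ]
    if len(matches) != 1:
        raise LookupError(
            f"{ref!r} selects {len(matches)} documents in scope {scopeid!r}, "
            "expected exactly 1; add or correct the '<scopeid>/' or '<type>-' part"
        )
    return matches[0]
```

With a corpus that holds both a `concept` and a `term` with `typeid: party` in the same scope, `%%party%%` would raise the error, while `%%party|term-party%%` would select exactly one document.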

TEv2 - generic document structure

2020-10-30T11:52:12Z

This issue specifies the structure of documents that are processable by TEv2 and/or its additions (generator-plugins). When approved, the text may become part of the eSSIF-Lab documentation that describes its terminology management.

# Header

Every file starts with a docusaurus header, i.e.

```md
---
id: document-identifier
title: title that will show when the document is rendered
... (other [docusaurus header items](https://v2.docusaurus.io/docs/markdown-features/#markdown-headers))
---
```

## TEv2 specific header attributes

After the regular docusaurus header attributes, additional attributes must be specified that allow TEv2 to function properly. These attributes must be in the same header block as the docusaurus attributes (i.e. in the same block delimited by the `---` lines), as follows:

```md
---
... (docusaurus attributes)
scopeid:
type:
typeid:
---
```

where:

- ***scopeid*** (required) identifies the scope within which the document (and its contents) belongs, and is being defined/updated/...
- ***type*** identifies the kinds of contents that can be expected in the document. The following types are being used (and should be supported):
  - **scope** the file contains the specification/description of a scope;
  - **concept** the file contains the definition of a term and a description of the concept to which it refers;
  - **term** the file contains the definition of a term and a reference to a concept;
  - **glossary** the file contains a specification for the generation of a glossary that is to become part of the same scope to which the glossary-file belongs;
  - **dictionary** the file contains a specification for the generation of a dictionary that is to become part of the same scope to which the dictionary-file belongs.

  Additional types may be defined as necessary.
- ***typeid***: text that ensures the triple (scopeid,type,typeid) always identifies the document.
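As an illustration of the requirement that the triple (scopeid,type,typeid) identifies a document, the following minimal sketch reads the header block and reports triples that occur more than once. It assumes a flat header of `key: value` lines (no nested YAML), and the function names `parse_header` and `duplicate_triples` are hypothetical, not part of TEv2 or docusaurus.

```python
from collections import Counter

TEV2_ATTRIBUTES = ("scopeid", "type", "typeid")


def parse_header(text: str) -> dict:
    """Read the key-value pairs between the opening and closing '---' lines.

    Assumption: every header line has the flat form 'key: value'.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document must start with a '---' header line")
    header = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return header
        key, colon, value = line.partition(":")
        if colon:
            header[key.strip()] = value.strip()
    raise ValueError("header block is not closed by a '---' line")


def duplicate_triples(headers: list) -> list:
    """Return every (scopeid, type, typeid) triple used by more than one document."""
    counts = Counter(
        tuple(h.get(name, "") for name in TEV2_ATTRIBUTES) for h in headers
    )
    return [triple for triple, n in counts.items() if n > 1]
```

A pipeline step could run `duplicate_triples` over all parsed headers and report each returned triple to the authors of the documents involved.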
## TEv2 Generator-specific attributes

Additional attributes may be added for purposes defined by different generators. For example:

- `hoverText: ` is used by the documentation website generator, for the purpose of producing a popup when the user hovers over a reference to that document. This text may not include %%-references.
- `glossaryText: ` is used by the glossary-generator as the explanation of a term. This text MAY contain %%-references where the `|` MUST be replaced by the `^` character (as the `|` character causes errors when it is part of a docusaurus-header).
- `stage: ` is foreseen to be used in ToIP terminology life cycle management processes.

Attributes that are not used by Docusaurus, TEv2 or a generator will be ignored (so you can leave them there if you don't use them).
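A generator that consumes `glossaryText` would have to turn the `^` back into `|` before processing the references, and a generator that consumes `hoverText` could reject %%-references early. A minimal sketch, assuming references never contain a literal `%` or `^` in their shown text or target; the function names are hypothetical:

```python
import re

# Matches %%shown-text^reference-text%% as written inside a header attribute.
CARET_REFERENCE = re.compile(r"%%([^%^]+)\^([^%^]+)%%")


def restore_references(glossary_text: str) -> str:
    """Rewrite %%shown^target%% to %%shown|target%%; other '^' characters stay."""
    return CARET_REFERENCE.sub(r"%%\1|\2%%", glossary_text)


def check_hover_text(hover_text: str) -> str:
    """Return the text unchanged, or raise if it contains a %%-reference."""
    if "%%" in hover_text:
        raise ValueError("hoverText may not include %%-references; remove them")
    return hover_text
```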

Topics for consideration of long-term Terminology Engine developments

2020-10-09T09:07:14Z

Over the last days, as an author, I experienced some difficulties in writing terminology-files. To me, it is not clear which of these difficulties should/could be resolved, and which are just something we'll have to deal with (or defer to version _n_ of the terminology engine). I created this issue so that I can log the difficulties I run into, and we can discuss what to do with them (if anything), and when.

I think we can live with these issues - at least for now. I can imagine that at some later stage (perhaps even after TEv2) more author-oriented parser/content processors might be constructed for use in a CI/CD pipeline that would resolve these issues, and they might have associated tools (e.g. vscode extensions) that help authors identify and correct any mistakes.

### 1. Referencing to pictures and documentation files that are not part of the terminology corpus

Currently, authors need to be aware of the underlying software, in our case docusaurus, when making such references. Drawbacks include:

- the documentation corpus is not trivially transferable to other systems (problems may include referring to other documents, or specifics in the markup);
- constraints for writing documents come from multiple sources (that may be updated independently of each other).

The implication might be that a parser/generator be developed that more or less replaces docusaurus, or at least many of its features, which seems to be overkill.

### 2. Docusaurus peculiarities

While I very much like the docusaurus idea of having a header with key-value pairs that can be used for further processing, the body of such files is very much code-writer-oriented.
For example, including an image requires the statement `import useBaseUrl from '@docusaurus/useBaseUrl';`, allowing the author to subsequently include the image `somefile.png` that has been stored in 'static/images/' using HTML tags:

~~~
~~~

Thinking in terms of our terminology model, it might be beneficial to treat an image in a similar way as a term (they're both semiotic [signs](https://en.wikipedia.org/wiki/Sign_(semiotics))), which would imply that they are stored in the location of the scope in which they have been defined.

### 3. Generator feedback to authors

I've seen the following happen:

![image](/uploads/3b3606ee7073ba26b3be78620c2b676e/image.png)

It is hard for (authors like) me to find out what is wrong and how to correct this. Better ways of informing users would be welcome. If the Terminology Engine is to be productized, better support of its users in terms of identifying mistakes and helping them fix them seems a basic requirement.

Terminology Engine v2 specifications

2021-08-26T13:17:35Z

# Introduction

Terminology Engine v1 (TEv1) is now operational; it enables us to not only define terminology, but also to actually put it to use and help readers of documentation understand what they read, and drill down to the level of understanding that they seek (some are satisfied with the popups that appear on defined terms, others will click on the term to get further context and background). That is: after the existing documentation has been updated to take full advantage of this tool.

Two of the major objectives of the eSSIF-Lab project are (a) creating an ecosystem of developers (and users) that continues to thrive after the project has ended, and (b) ensuring interoperability, which is not just about the technology, but about the documentation as well. Given that different parties (subgrantees) will use different vocabularies/terminologies, we seek ways to ensure terminological interop while accommodating each party's specific terms.

# Purpose

The main objective of TEv2 would then be to facilitate the new version of the eSSIF-Lab architecture, which is going to be constructed by eSSIF-Lab consortium partner(s), every subgrantee of the infrastructure call, and each subgrantee of BOC#1 that produces (open source) components that extend the infrastructure. The resulting architecture should contain a functional description of each of these components, and this description must be embedded in the 'integral whole'.

# High-level requirements

The high-level requirements of TEv2 are:

- [ ] Writing terminology and other documentation shall be easy for different authors. This means that they must not be required to go through an extensive learning-cycle. Also, they should be provided with tooling that helps them to write documentation that passes the CI-pipe without any problems, specifically by not only pointing out whether or not they made a mistake, but as much as possible by identifying where, in their own documents, the cause of the error lies, and what it takes to fix it.
- [ ] It should be possible for authors to keep their documentation (including term-definitions) separate from that of other subgrantees (in a 'scope' of their own), so that they can document their stuff in an internally consistent/coherent way without being bothered by terms that others use. This scope may also serve as a namespace. Ideally, they should be able to write their documentation irrespective of the implementation of TEv2 (meaning that it would work equally well if TEv2 were implemented as a Docusaurus extension or as a stand-alone tool (that e.g. could update a Confluence or wiki site)).
- [ ] It shall be possible to work with multiple such scopes (see the terminology-pattern when that becomes available). A scope has its own vocabulary, which consists of the terms (and other stuff) defined within that scope, supplemented with terms that are defined in other scopes/namespaces (where the term itself may be changed if that is appropriate for that scope).
- [ ] It shall be possible for authors to use terms in their documentation files that have been defined within the scope in which that file is created, where 'using a term' means that it is made to stand out when rendering the document, that it has a popup, and that clicking on it navigates the user to the place where it is defined (as in TEv1). It shall also be possible to use terms that are defined in another scope, provided that the scope supports TEv2 functionality.
- [ ] For existing terminologies, such as that of NIST, DIF, Sovrin, etc., that do not support TEv2 functionality, we may decide to create a scope within the eSSIF-Lab context that does support TEv2 functionality, thus allowing authors to refer to such terms with all the benefits that TEv2 brings.
- [ ] It shall be possible (for technically sufficiently capable engineers) to create generators that take one or more source file(s) from documentation authors, and create artifacts from that, e.g. for making dictionaries, concept-graphs, statistics and more. Note that this item only requires that such engineers be facilitated to create such generators. Nevertheless, for maintainability, it is suggested that every scope should contain a directory `generator-specs` that contains the files that generators use to generate their artifacts. Such specification files would conform to the generic TEv2-document-template described above, where their `type`-attribute would correspond with the generator. So a file that would specify an artifact to be generated by e.g. the xxx-generator would specify `xxx` in its `type` field (so `glossary` and `dictionary` would be valid `xxx`s, but we can think of (but not necessarily implement) more).
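The last requirement, where the `type` field of a specification file selects the generator, could be sketched as a small registry. This is only one possible design under assumptions: the registry `GENERATORS`, the decorator `generator`, the example `make_glossary` and the representation of a spec file as a dict of its header attributes are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps the value of a spec file's `type` attribute
# to the function that produces the corresponding artifact.
GENERATORS: Dict[str, Callable[[dict], str]] = {}


def generator(type_name: str):
    """Register a function as the generator for spec files of `type_name`."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        GENERATORS[type_name] = fn
        return fn
    return register


@generator("glossary")
def make_glossary(spec: dict) -> str:
    # Placeholder artifact; a real generator would render the scope's terms.
    return f"glossary for scope {spec['scopeid']}"


def run_generators(specs: List[dict]) -> List[str]:
    """Run the matching generator for every spec found in `generator-specs`."""
    artifacts = []
    for spec in specs:
        fn = GENERATORS.get(spec.get("type", ""))
        if fn is None:
            raise LookupError(
                f"no generator registered for type {spec.get('type')!r} "
                f"(typeid {spec.get('typeid')!r}); known types: {sorted(GENERATORS)}"
            )
        artifacts.append(fn(spec))
    return artifacts
```

Adding a new kind of artifact would then mean registering one more function, without touching the code that walks the `generator-specs` directory.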

eSSIF-Lab framework issues
