Analysis
Overview
This group of tasks is about improving the base.persistence module and providing the new R3 resource format. After this task is complete, the following should be true:
- The base.persistence module will provide all the commons needed for persistence.
- The R3 format will be a Resource format, not just a Book format. This means that any kind of Resource (Book, Page, Frame, etc.) can be saved in this format.
- Functionality for saving Immutables and ProObjects (Resources, in particular) will be present.
- An intermediate storage format will exist.
- Extensions and extension points will exist to simplify the usage and improve the extensibility of the format.
Task requirements
The R3 format replaces the R2 format. These are the goals of this task:
- The R2 format is a file format for saving a given Book. Compared to it, the R3 format is a Resource format - it should be capable of saving any given Resource.
- This Resource-oriented format brings features that should be kept in mind when implementing it:
- In the future it should allow saving/loading any Resource - Frame, Page, Annotation. This will be very useful for a Frame/Page editor, for example. It would also be useful for saving templates.
- Improve machine and human readability of the format:
- XML aspect - redesign the used XML tags to be both human and machine readable.
- File structure - create a good and consistent structure inside the archive.
- Data standards - what goes where:
- XML aspect - XML should contain Immutables and meta-information. Define where it belongs in the structure.
- Binary data - define location of the binary data for each ProObject (Resources in particular).
- Text data - define how to handle text contents; where to store them and in what form.
- Define how a Resource is located and accessed (ID, local path, URL).
- Support caching and offline content - currently a cache mechanism doesn't exist. The file format needs to support a cache and the ability to save online resources.
- Caching however should be limited. In future releases we may consider GUI input for the limit, for example "100MB"
- Make the R3 format backward compatible with R2:
- The application should be able to load a book in R2 format and save it to R3 format.
- Optional - achieve saving a book in R2 format. This doesn't need to have GUI for saving in R2 but there may be a mechanism for converting R3 to R2.
- Support of saving online content. This may not be implemented now, but in future we may have to support downloading online Resources and saving them into the book.
- Also in the future, we need to refactor the Storage in order to implement lazy loading of Resources (and Resource preview, respectively).
Task result
- Source code
- Wiki content
Implementation idea
Define a tree structure and divide the book into folders.
- Each resource has its own folder and contains sub-folders of its subresources.
- Each folder contains an XML with properties (immutables) and information for the specific resource, and information about the subresources. The folder should also contain the binary data of the resource.
- The book's folder has a subfolder for cache. Consider whether each resource should have its own cache directory or all resources should use the book's cache directory.
- Implement the annotations that were discussed - @Persist, @Immutable (see the video of the discussion for more information on these):
- An annotation for entities not to be persisted might be needed.
- (Optional) Provide a class that handles all Immutables coming from the JDK.
- Design and implement a format registry that keeps track of all formats and their features and limitations:
- see BASE_PERSISTENCE_FORMAT_REGISTRY_R0 for more ideas on that.
- Ensure backward compatibility.
- Consider forward compatibility in the design.
Related
GROUP_RESOURCE_R0
GROUP_CHANGE_R0?
http://asteasolutions.net/videos/
How to demo
- Explain the ideas of the format.
- Demo saving and loading of various Resources.
- Run the tests/demos.
- (Optional) Show the created wiki page.
Design
R3 Format Structure
The R3 format is not just a Book format but a Resource format. It represents a given Resource with its subResources, binary data, cached data, external Resource references etc.
Since Resources form a hierarchy (for instance Book -> Page -> Frame), the R3 format structure follows the same hierarchy.
The R3 format is a zipped file with some directory structure inside and some files:
- each Resource described inside (either the root Resource or any of its subResources) has a corresponding directory with some helper directories and files.
The exact structure of the format is described below. In this example a Book is saved, though any other kind of Resource can be saved as well:
- The name of the zip file is something like "ExampleBook.s2b". This name without the file extension (namely "ExampleBook") is the name of the Book Resource.
- The root directory of the file contains:
- _cache - a directory for caching remote Resources.
- _data - a directory for storing binary and text data. Instead of embedding unreadable binary data in an .xml file, it is stored here and only referenced from other places (like the .xml files).
- _resource.xml - an .xml file describing the structure, properties and subResources of the root Resource.
- other directories:
- for instance, a "MeddleSubResource" directory corresponds to a subResource of the root Resource with the name "MeddleSubResource". The structure of this subdirectory is analogous to that of the root Resource.
- certain Resources can contain other directories describing templates. Templating is still in design so conventions for them are still not finished.
Resource names cannot start with an underscore so they won't conflict with _cache, _data and _resource.xml.
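The layout above can be sketched with the JDK's zip classes. This is only an illustration under stated assumptions, not the project's implementation: the class R3LayoutSketch and its methods are hypothetical, the archive is built in memory, and the extra rule that a name must not contain '/' is an assumption added here (the text above only forbids a leading underscore).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch of the R3 archive layout; not the real persister. */
final class R3LayoutSketch {

    /** A leading underscore is reserved for _cache, _data and _resource.xml. */
    static boolean isValidResourceName(String name) {
        // The '/' check is an assumption of this sketch, not a documented rule.
        return name != null && !name.isEmpty()
                && !name.startsWith("_") && !name.contains("/");
    }

    /** Writes a root Resource with one subResource into an in-memory zip. */
    static byte[] writeArchive(String rootXml, String subResourceName, String subXml) {
        if (!isValidResourceName(subResourceName)) {
            throw new IllegalArgumentException("Invalid resource name: " + subResourceName);
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            addDirectory(zip, "_cache/");
            addDirectory(zip, "_data/");
            addFile(zip, "_resource.xml", rootXml);
            // The subResource directory repeats the structure of the root.
            addFile(zip, subResourceName + "/_resource.xml", subXml);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Lists the entry names of an archive in the order they were written. */
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    private static void addDirectory(ZipOutputStream zip, String path) throws IOException {
        zip.putNextEntry(new ZipEntry(path)); // a trailing '/' marks a directory entry
        zip.closeEntry();
    }

    private static void addFile(ZipOutputStream zip, String path, String content) throws IOException {
        zip.putNextEntry(new ZipEntry(path));
        zip.write(content.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }
}
```

A real implementation would also write the contents of _data and recurse into deeper subResources; the sketch stops at one level to keep the structure visible.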
_resource.xml
Here's a sample _resource.xml for the root Resource in ExampleBook ADD-LINK:
<?xml version="1.0" encoding="UTF-8"?>
<resource kind="book" entity-id="e42f2de5-7463-439c-8eb3-25fa60424e7b" size="1254" mimeType="text/html" date-created="2009-04-13T16:15:00">
    <!-- All resource common features are persisted as attributes of the resource tag.
         All specific book features are persisted as children. -->
    <title> Meddle In Wonderland (: </title>
    <page-size>
        <width>400</width>
        <height>500</height>
    </page-size>
    <pages>
        <page index="1">
            <ref location="./MeddlePage" entity-id="5e0f117a-033a-4612-957c-3e0dab3b22a9" />
        </page>
        <page index="0">
            <ref location="./templatedPage" entity-id="dsfg34dfg41234" />
        </page>
    </pages>
    <last-current-page>
        <ref location="./MeddlePage" entity-id="5e0f117a-033a-4612-957c-3e0dab3b22a9" />
    </last-current-page>
    <frame-templates>
        <frame-template>
            <ref location="./goodFrameTemplate" entity-id="3f92f3ca-f581-4e10-9353-856ab1c32f2g" /> <!-- local resource-->
        </frame-template>
        <frame-template>
            <ref location="/home/mrsackless/Desktop/TanyaMemoBook/ChildhoodPage/lovelyFrame" entity-id="1e0f235a-033a-2314-957c-4e2dab6b47a1" /> <!-- outer resource-->
        </frame-template>
    </frame-templates>
    <page-templates/>
    <comment> A book everyone should read </comment>
</resource>
Here are some explanations on the structure and content of this .xml file:
- It begins with a standard XML declaration (<?xml ... ?>).
- All other content is wrapped in a <resource> tag which contains the name of the Resource, location, entityID and other attributes.
- Properties of the Resource are saved with respective tags and inside these tags are either some ResourceRefs or the saved form of an Immutable (depends on the value that the property holds).
- ListProperties are saved in a similar fashion, except that each element of the ListProperty has its own tag, and the series of these tags is wrapped in one enclosing tag for the whole ListProperty. All list element tags have an 'index' attribute. This attribute is added because the XML representation does not treat the order of same-named children on the same level as significant. For example, the page elements in the list of pages may change places in the XML representation, but the information about their original order must not be lost.
- SubResources of the Resource are treated in the same fashion as Properties because they are Properties (:
- Template persistence is still in design, so templates are treated in a special way.
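The role of the 'index' attribute can be illustrated with the JDK's DOM parser. The following is a minimal sketch, assuming the <pages>/<page>/<ref> element names from the sample; the class IndexedListReader and its method are hypothetical and not part of the code base. It restores the list order from the attribute instead of relying on document order.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader that orders list elements by their 'index' attribute. */
final class IndexedListReader {

    /** Returns the ref locations of all page elements, sorted by index. */
    static List<String> pageLocations(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalArgumentException("Cannot parse XML", ex);
        }

        // Collect the page elements in whatever order the document lists them.
        NodeList pages = doc.getElementsByTagName("page");
        List<Element> elements = new ArrayList<>();
        for (int i = 0; i < pages.getLength(); i++) {
            elements.add((Element) pages.item(i));
        }

        // The 'index' attribute, not the document order, defines the list order.
        elements.sort(Comparator.comparingInt(e -> Integer.parseInt(e.getAttribute("index"))));

        List<String> locations = new ArrayList<>();
        for (Element page : elements) {
            Element ref = (Element) page.getElementsByTagName("ref").item(0);
            locations.add(ref.getAttribute("location"));
        }
        return locations;
    }
}
```

Applied to the sample above, where the page with index="1" is written before the page with index="0", the templated page would come first in the restored list.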
Code Structure
Storage
A storage is similar to an XML node: it is a node of the storage tree. Each storage may have several attributes and/or children, which are also storages. It can also have either text or binary content. The storage should be refactored:
- The old logic for registering loaders should be removed.
- Methods should be added that always return a child or attribute by its name, or by its name and index (for storages with more than one child with the same name). When it does not exist, the method creates it and returns the newly created storage. These methods should be:
- public Storage child(String name)
- public Storage attribute(String name)
- public Storage childAt(String name, int index)
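A minimal sketch of these get-or-create accessors follows. The Storage class here is a simplified stand-in, not the real class: the internal maps, the text accessors and childCount are assumptions made for illustration, and binary content is left out. Creating all missing children up to the requested index in childAt is also an assumption, since the text does not say how gaps are handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for a storage node with get-or-create accessors. */
class Storage {
    private final String name;
    // Children grouped by name; one name may map to several storages.
    private final Map<String, List<Storage>> children = new LinkedHashMap<>();
    private final Map<String, Storage> attributes = new LinkedHashMap<>();
    private String text; // text content, if any

    Storage(String name) {
        this.name = name;
    }

    public String getName() { return name; }
    public String getText() { return text; }
    public void setText(String text) { this.text = text; }

    /** Returns the first child with the given name, creating it if absent. */
    public Storage child(String name) {
        return childAt(name, 0);
    }

    /** Returns the attribute with the given name, creating it if absent. */
    public Storage attribute(String name) {
        return attributes.computeIfAbsent(name, Storage::new);
    }

    /** Returns the child with the given name and index, creating missing ones. */
    public Storage childAt(String name, int index) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        List<Storage> sameName = children.computeIfAbsent(name, k -> new ArrayList<>());
        while (sameName.size() <= index) {
            sameName.add(new Storage(name)); // assumption: fill gaps with empty storages
        }
        return sameName.get(index);
    }

    /** Number of existing children with the given name (helper for this sketch). */
    public int childCount(String name) {
        List<Storage> sameName = children.get(name);
        return sameName == null ? 0 : sameName.size();
    }
}
```

With accessors like these, a persister can write root.child("title").setText(...) when saving and read the same path when loading, without checking for existence first.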
Schema
The persistence schema is a string identifier of the needed persister. It defines what is going to be persisted (the source), to/from what it is persisted (the target) and in what way (the format). These three major parts of the Schema string are separated with '|', so it looks like this:
- "source|target|format"
The source and target are described hierarchically. Every step of the source and target hierarchies is separated from the previous one with ':'. Here are some example schemas:
- "imm:point|storage|r3" - meaning "immutable point should be persisted to storage in r3 format".
- "resource:book|storage|r3" - meaning "book resource should be persisted to storage in r3 format".
- "content:image|storage|r3" - meaning "image content should be persisted to storage in r3 format".
- "storage:resource|package-file:zip|r3" - meaning "storage, representing resource, should be persisted to package-file which is zipped in r3 format".
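The schema syntax can be illustrated with a small parser. This is a sketch only: the Schema class below is hypothetical (the design treats the schema as a plain string), and rejecting empty parts is an assumption.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical parsed form of a "source|target|format" schema string. */
final class Schema {
    final List<String> source; // hierarchy steps, e.g. ["resource", "book"]
    final List<String> target; // hierarchy steps, e.g. ["package-file", "zip"]
    final String format;       // e.g. "r3"

    private Schema(List<String> source, List<String> target, String format) {
        this.source = source;
        this.target = target;
        this.format = format;
    }

    static Schema parse(String schema) {
        // '|' separates the three major parts; the limit -1 keeps empty parts visible.
        String[] parts = schema.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected source|target|format: " + schema);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty schema part in: " + schema);
            }
        }
        // ':' separates the steps of the source and target hierarchies.
        return new Schema(
                Arrays.asList(parts[0].split(":")),
                Arrays.asList(parts[1].split(":")),
                parts[2]);
    }
}
```

For example, Schema.parse("storage:resource|package-file:zip|r3") yields the source steps storage and resource, the target steps package-file and zip, and the format r3.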
Persister <S, T>
Persisters are generic classes responsible for the persistence of all objects. The base class for persisters, Persister<S, T>, has two type parameters - the source class and the target class - and two methods:
- String getSchema() - This method returns the schema for this persister.
- void persist(S source, T target, Mode mode) - This method is used to actually persist the given source object to the given target object. Whether the method is saving or loading depends on the mode parameter.
Persistence modes should be described in the Mode enum. For now there are two modes - SAVE and LOAD.
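The following sketch shows how the base class, the Mode enum and one concrete persister could fit together. Only Persister<S, T>, getSchema(), persist(...) and Mode come from the design above; PointHolder and PointPersister are hypothetical, and a Map<String, String> is used as a stand-in for the storage target to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/** Persistence modes as described above. */
enum Mode { SAVE, LOAD }

/** Base class for persisters: S is the source class, T is the target class. */
abstract class Persister<S, T> {
    public abstract String getSchema();
    public abstract void persist(S source, T target, Mode mode);
}

/** Hypothetical mutable stand-in for a point; used only in this sketch. */
final class PointHolder {
    int x;
    int y;
}

/** Hypothetical persister; a Map stands in for the real storage target. */
final class PointPersister extends Persister<PointHolder, Map<String, String>> {

    @Override
    public String getSchema() {
        return "imm:point|storage|r3";
    }

    @Override
    public void persist(PointHolder source, Map<String, String> target, Mode mode) {
        switch (mode) {
            case SAVE: // source -> target
                target.put("x", Integer.toString(source.x));
                target.put("y", Integer.toString(source.y));
                break;
            case LOAD: // target -> source
                source.x = Integer.parseInt(target.get("x"));
                source.y = Integer.parseInt(target.get("y"));
                break;
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }
}
```

The sketch uses a mutable holder because a truly immutable source cannot be filled in during LOAD.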
Refs
An object that is yet to be created during loading obviously cannot be passed in as a parameter. A Ref base class should therefore be created for references to objects, with two methods: set and get. It should be generic so that there are no unsafe casts from Object to the specific referenced type. There should be two types of Refs:
- ValueRefs keep a reference to a value object. The value is initialized by the method getInnitial(), which is used only if get() is called before the first set(T value).
- PropRefs are created on the basis of an existing RwProp<T>. These references use the property's standard set and get methods.
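A minimal sketch of the two kinds of Refs is given below. The shapes are assumptions: RwProp<T> is reduced to a two-method stand-in interface, SimpleProp is a hypothetical implementation used only to exercise PropRef, and the initial value of a ValueRef is supplied through a Supplier instead of an overridable method.

```java
import java.util.function.Supplier;

/** Generic reference to an object that may not exist yet while loading. */
abstract class Ref<T> {
    public abstract T get();
    public abstract void set(T value);
}

/** Holds a value; the initial value is computed only if get() precedes set(). */
final class ValueRef<T> extends Ref<T> {
    private final Supplier<? extends T> initial; // stand-in for the initial-value method
    private T value;
    private boolean assigned;

    ValueRef(Supplier<? extends T> initial) {
        this.initial = initial;
    }

    @Override
    public T get() {
        if (!assigned) {
            value = initial.get();
            assigned = true;
        }
        return value;
    }

    @Override
    public void set(T value) {
        this.value = value;
        this.assigned = true;
    }
}

/** Stand-in for the real read-write property type; an assumption of this sketch. */
interface RwProp<T> {
    T get();
    void set(T value);
}

/** Hypothetical trivial property, used only to demonstrate PropRef. */
final class SimpleProp<T> implements RwProp<T> {
    private T value;

    SimpleProp(T value) { this.value = value; }

    @Override public T get() { return value; }
    @Override public void set(T value) { this.value = value; }
}

/** Delegates get and set to an existing property. */
final class PropRef<T> extends Ref<T> {
    private final RwProp<T> prop;

    PropRef(RwProp<T> prop) {
        this.prop = prop;
    }

    @Override public T get() { return prop.get(); }
    @Override public void set(T value) { prop.set(value); }
}
```

A persister in LOAD mode can then call ref.set(loadedValue) on either kind of Ref without knowing whether the value ends up in a local holder or in a property.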
Modularity
Move the useful logic out of the two modules persistence.r1 and persistence.r2, then remove them and fix the pom files and cyclic dependencies. The base.persistence module should incorporate:
- All base classes: Persister<S, T>, Mode, Refs.
- Storage and storage Utils
- All basic persisters for immutables from the JDK or from base.commons.
It should provide an extension point for the registration of persisters. All persisters of non-basic objects should be placed and registered in the modules of the objects they are responsible for.
There should be a class MasterPersister which tracks the registered persisters. All other persisters should use it to persist objects that are not their responsibility. The MasterPersister should find the appropriate persister for a given schema and call it with the given parameters. This should be done in the static method void persist(Object source, Object target, Mode mode, String schema). If a persister with the given schema is not found, an UnsupportedOperationException is thrown.
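The lookup-and-delegate behaviour can be sketched as follows. Only the persist signature and the UnsupportedOperationException come from the description above; the register method, the map-based registry and the TextPersister with its "demo:text|demo:buffer|r3" schema are assumptions made for illustration (the real registration goes through an extension point).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

enum Mode { SAVE, LOAD }

abstract class Persister<S, T> {
    public abstract String getSchema();
    public abstract void persist(S source, T target, Mode mode);
}

/** Sketch of a registry that dispatches by schema string. */
final class MasterPersister {
    private static final Map<String, Persister<?, ?>> REGISTRY = new ConcurrentHashMap<>();

    private MasterPersister() {
    }

    /** Hypothetical registration hook, keyed by the persister's own schema. */
    static void register(Persister<?, ?> persister) {
        REGISTRY.put(persister.getSchema(), persister);
    }

    /** Finds the persister for the schema and delegates to it. */
    @SuppressWarnings("unchecked")
    static void persist(Object source, Object target, Mode mode, String schema) {
        // The schema is the only key, so the argument types cannot be checked statically.
        Persister<Object, Object> persister = (Persister<Object, Object>) REGISTRY.get(schema);
        if (persister == null) {
            throw new UnsupportedOperationException("No persister for schema: " + schema);
        }
        persister.persist(source, target, mode);
    }
}

/** Hypothetical persister used only to exercise the registry. */
final class TextPersister extends Persister<StringBuilder, StringBuilder> {
    @Override
    public String getSchema() {
        return "demo:text|demo:buffer|r3";
    }

    @Override
    public void persist(StringBuilder source, StringBuilder target, Mode mode) {
        if (mode == Mode.SAVE) {
            target.append(source); // source -> target
        } else {
            source.append(target); // target -> source
        }
    }
}
```

One consequence of dispatching on the schema string alone is that a mismatch between the schema and the actual argument types only shows up at run time as a ClassCastException.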
Logic
When a resource, for example a book, is created, it is located in the 'tmp' location. Once the user decides to save it, a file dialog is shown so that an appropriate location for the book can be chosen. When the book is saved, the tmp book is disposed and the newly saved book is opened from the local file system, so that the next time the user decides to save, the location does not have to be chosen again. The user can still choose 'save as', in other words change the location of the currently opened book again. The copy in the old location stays in its last saved state until the user decides to open it again via 'open Book'.
A book that is already open should not be opened a second time. Its book window should first be closed, and then the book should be reopened.
Tests
The last three tests of StorageTest, the last test of ResourceLocalCacheTest, CommonsPersistenceTest, VideoFramePersistTest, BookPersistenceTest, and the last test of BookLogicIntegrTest.
Implementation
Implementing the described design required some additional Resource functionality:
Persisters should create a resource from its location and entityId, so a method createResource(String entityId, String location, Class<T> resourceType) was added to ResourceSpace. Its logic is very simple: it splits the given location at its last file separator into two parts, the parent location and the name. It then uses the locate method of ResourceLocalCache to locate the parent of the resource to be created, and invokes the already existing buildResource(String entityId, String resourceName, Class<T> resourceType, String parentLocation, Resource parent).
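The splitting step can be sketched on its own. The ResourceLocation class is hypothetical, '/' is assumed as the file separator (as in the sample locations such as "./MeddlePage"), and treating a location without a separator as having an empty parent is an assumption.

```java
/** Hypothetical helper that splits a location into parent location and name. */
final class ResourceLocation {
    final String parentLocation;
    final String name;

    private ResourceLocation(String parentLocation, String name) {
        this.parentLocation = parentLocation;
        this.name = name;
    }

    /** Splits at the last '/' (assumed separator); no separator means an empty parent. */
    static ResourceLocation split(String location) {
        int lastSeparator = location.lastIndexOf('/');
        if (lastSeparator < 0) {
            return new ResourceLocation("", location);
        }
        return new ResourceLocation(
                location.substring(0, lastSeparator),
                location.substring(lastSeparator + 1));
    }
}
```

In createResource, the parent location would then be passed to the cache's locate method and the name to buildResource.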
There was also a need to discard a book when its book window is closed, so that it can be opened again from a saved file. Three methods for removing a resource from the cache were added to ResourceLocalCache:
- public void removeFromCache(Resource resource) - This method removes only the given resource from the local cache, but not its children.
- public void removeAllFromCache(Resource resource) - This method removes the given resource from the cache along with its whole resource tree. It iterates over all child resources and removes them as well.
- public void removeFromCache(ResourceRef resourceRef) - This method removes only the resource pointed to by the given ResourceRef.
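The difference between removing one resource and removing its whole tree can be sketched as follows. Everything here is a simplified assumption: CachedNode stands in for Resource, LocalCacheSketch for ResourceLocalCache, and the cache is modelled as a map keyed by location. The ResourceRef overload is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a Resource with child resources. */
final class CachedNode {
    final String location;
    final List<CachedNode> children = new ArrayList<>();

    CachedNode(String location) {
        this.location = location;
    }
}

/** Hypothetical stand-in for the local cache, keyed by location. */
final class LocalCacheSketch {
    private final Map<String, CachedNode> byLocation = new HashMap<>();

    /** Caches the node and, recursively, all of its children. */
    void putAll(CachedNode node) {
        byLocation.put(node.location, node);
        for (CachedNode child : node.children) {
            putAll(child);
        }
    }

    boolean contains(String location) {
        return byLocation.containsKey(location);
    }

    /** Removes only the given node; its children stay cached. */
    void removeFromCache(CachedNode node) {
        byLocation.remove(node.location);
    }

    /** Removes the given node together with its whole subtree. */
    void removeAllFromCache(CachedNode node) {
        removeFromCache(node);
        for (CachedNode child : node.children) {
            removeAllFromCache(child);
        }
    }
}
```

Closing a book window would correspond to the tree variant, since the pages and frames of the discarded book must not stay in the cache either.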
Changesets: 2581, 2600, 2601, 2615, 2636, 2638, 2646, 2647, 2653, 2669, 2679, 2680, 2709, 2712, 2727, 2808, 2817, 2824, 2826, 2827, 2829, 2910, 2918, 2939, 2940, 2964
Merged to the trunk in: [2970], [2971], [2972].
Testing
Place the testing results here.
Comments
Write comments for this or later revisions here.