To get this branch, use:
bzr branch http://suren.me/webbzr/alps/pcitool

Environment
===========
 PCILIB_PLUGIN_DIR - override path to directory with plugins

Memory Addressing
=================
 There are 3 types of addresses: virtual, physical, and bus. For DMA, a bus
 address is used. However, on x86 the physical and bus addresses are the same
 (on other architectures this is not guaranteed). The xdma driver still relies
 on this assumption and uses the physical address for DMA access. I have ported
 it in the same way. Now we additionally need to provide bus addresses in the
 kmem abstraction and use them in the NWL DMA implementation.

DMA Access Synchronization
==========================
 - At driver level, a few types of buffers are supported:
    * SIMPLE - non-reusable buffers; the usage information can be used for
    cleanup after crashed applications.
    * EXCLUSIVE - reusable buffers which can be mmaped by a single application
    only. There are two modes of these buffers:
        + Buffers in STANDARD mode are created for a single DMA operation. If
        such a buffer is detected while trying to reuse, the last operation
        has failed and a reset is needed.
        + Buffers in PERSISTENT mode are preserved between invocations of the
        control application and cleaned up only after the PERSISTENT flag is
        removed.
    * SHARED - reusable buffers shared by multiple processes. Not really
    needed at the moment.

    KMEM_FLAG_HW - indicates that the buffer can be used by hardware; actually
    this means that DMA will be enabled afterwards. The driver is not able to
    check whether it really was enabled and therefore will block any attempt to
    release the buffer until KMEM_FLAG_HW is passed to the kmem_free routine as
    well. The latter should only be called with KMEM_FLAG_HW after the DMA
    engine is stopped. Then the buffer can be released by kmem_free if the
    reference count reaches 0.
    
    KMEM_FLAG_EXCLUSIVE - prevents multiple processes from mmaping the buffer
    simultaneously. This is used to prevent multiple processes from using the
    same DMA engine at the same time. When passed to kmem_free, it allows
    cleaning buffers with lost clients even for shared buffers.
    
    KMEM_FLAG_REUSE - requires reuse of an existing buffer. If a reusable buffer
    is found (non-reusable buffers, i.e. those allocated without KMEM_FLAG_REUSE,
    are ignored), it is returned instead of allocating a new one. Three types of
    usage counters are used. At the moment of allocation, the HW reference is
    set if necessary. The usage counter is increased by the kmem_alloc function
    and decreased by kmem_free. Finally, the reference is obtained and returned
    during mmap/munmap. So, on kmem_free, we do not clean
        a) buffers with a reference count above zero or the hardware reference
        set. The REUSE flag should be supplied, otherwise an error is returned
        b) PERSISTENT buffers. The REUSE flag should be supplied, otherwise an
        error is returned
        c) non-exclusive buffers with a usage counter above zero (for an
        exclusive buffer, a usage counter above zero just means that the
        application has failed without cleaning the buffers first. There is no
        easy way to detect that for shared buffers, so it is left as a manual
        operation in this case)
        d) any buffer if KMEM_FLAG_REUSE was provided to the function
    During module unload, only buffers with references can prevent cleanup. In
    this case the only possibility to free the driver is to call kmem_free
    passing FORCE flags.
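
    The kmem_free rules above can be sketched as a small Python model. This is
    not the driver code: the Buffer fields, the numeric flag values, the string
    results, and the order in which the counters are updated are assumptions
    made only for illustration.

```python
from dataclasses import dataclass
from enum import IntFlag


class KmemFlag(IntFlag):
    # Names mirror the flags in the text; the numeric values are made up.
    REUSE = 1
    EXCLUSIVE = 2
    PERSISTENT = 4
    HW = 8


@dataclass
class Buffer:
    exclusive: bool = False
    persistent: bool = False
    hw_ref: bool = False   # hardware reference
    refs: int = 0          # references taken by mmap and dropped by munmap
    usage: int = 1         # increased by kmem_alloc, decreased by kmem_free


def kmem_free(buf: Buffer, flags: KmemFlag) -> str:
    """Model one kmem_free call; returns 'cleaned', 'kept' or 'error'."""
    buf.usage = max(0, buf.usage - 1)
    if flags & KmemFlag.HW:
        buf.hw_ref = False       # caller states that the DMA engine is stopped
    if flags & KmemFlag.PERSISTENT:
        buf.persistent = False   # back to the non-persistent mode

    reuse = bool(flags & KmemFlag.REUSE)
    # (a) and (b): the buffer must stay; freeing without REUSE is an error.
    if buf.refs > 0 or buf.hw_ref or buf.persistent:
        return "kept" if reuse else "error"
    # (c): a shared buffer may still be in use by another process.
    if not buf.exclusive and buf.usage > 0:
        return "kept"
    # (d): the caller asked to keep the buffer for later reuse.
    if reuse:
        return "kept"
    return "cleaned"
```

    For example, an exclusive buffer that still holds the hardware reference
    gives "error" when freed without flags and "cleaned" when freed with
    KmemFlag.HW.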
    
    KMEM_FLAG_PERSISTENT - if passed to the allocation routine, changes the mode
    of the buffer to PERSISTENT; if passed to the free routine, vice versa,
    changes the mode of the buffer to NORMAL. Basically, if we call
    'pci --dma-start' this flag should be passed to alloc, and if we call
    'pci --dma-stop' it should be passed to free. In other cases, the flag
    should not be present.

    If the application crashed, munmap will still be called, cleaning software
    references. However, the hardware reference will stay since it is not clear
    whether the hardware channel was closed or not. To lift the hardware
    reference, the application can be re-executed (or dma_stop called, for
    instance).
    * If there is no hardware reference, the buffers will be reused by the next
    call to the application and, for EXCLUSIVE buffers, cleaned at the end.
    SHARED buffers will be cleaned during module cleanup only (no active
    references).
    * The buffer will be reused by the next call, which can result in wrong
    behaviour if the buffer was left in an incoherent state. This should be
    handled on the upper level.
    
 - At pcilib/kmem level, synchronization of multiple buffers is performed
    * The HW reference and the following modes should be consistent between
    member parts: REUSABLE, PERSISTENT, EXCLUSIVE (only the HW reference and
    PERSISTENT mode should be checked, the others are handled on driver level)
    * It is fine if only part of the buffers are reused and the others are
    newly allocated. However, on a higher level this can be checked and result
    in failure.
    
    Treatment of inconsistencies:
     * Buffers are in PERSISTENT mode, but newly allocated, OK
     * Buffers are reused, but are not in PERSISTENT mode (for EXCLUSIVE buffers
     this means that the application has crashed during the last execution), OK
     * Some of the buffers are reused (not just REUSABLE, but actually reused),
     others are not, OK provided that
        a) either the PERSISTENT flag is set or the reused buffers are
        non-PERSISTENT
        b) either the HW flag is set or the reused buffers do not hold a HW
        reference
     * PERSISTENT mode inconsistency, FAIL (even if we are going to set
     PERSISTENT mode anyway)
     * HW reference inconsistency, FAIL (even if we are going to set the
     HW flag anyway)
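
    The consistency rules above can be written down as one check. This is a
    sketch, not the pcilib code: the Block type and the function name are
    hypothetical, and it assumes that an "inconsistency" means reused blocks
    that disagree with each other about PERSISTENT mode or the HW reference.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    reused: bool               # True if an existing buffer was actually reused
    persistent: bool = False   # state found on the buffer before our flags
    hw: bool = False           # HW reference found on the buffer


def blocks_consistent(blocks: List[Block], want_persistent: bool,
                      want_hw: bool) -> bool:
    """Return True if the set of blocks may be used (OK), False on FAIL."""
    reused = [b for b in blocks if b.reused]
    if not reused:
        return True    # everything newly allocated: always OK
    # Reused blocks must agree with each other, whatever flags we request.
    if len({b.persistent for b in reused}) > 1:
        return False   # PERSISTENT mode inconsistency
    if len({b.hw for b in reused}) > 1:
        return False   # HW reference inconsistency
    if len(reused) < len(blocks):
        # Mixed case: new blocks only match the reused ones if we request
        # the state that the reused blocks already carry.
        if reused[0].persistent and not want_persistent:
            return False
        if reused[0].hw and not want_hw:
            return False
    return True
```

    A fully reused set that is not in PERSISTENT mode passes the check; the
    text treats this as the sign of a crashed application, which is handled
    separately.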
     
    On an allocation error at one of the buffers, call the clean routine and
     * Preserve PERSISTENT mode and the HW reference if the buffers held them
     before the unsuccessful kmem initialization. Up to the last failed block,
     the blocks of kmem should be consistent. The HW/PERSISTENT flags should be
     removed if all reused blocks were in HW/PERSISTENT mode. The last block
     needs special treatment. The flags may be removed for the block if it was
     in HW/PERSISTENT state (and the others were not).
     * Remove the REUSE flag; we want to clean if allowed by the current buffer
     status
     * The EXCLUSIVE flag is not important for the kmem_free routine.
    
 - At DMA level
    There are 4 components of DMA access:
    * DMA engine enabled/disabled
    * DMA engine IRQs enabled/disabled - always enabled at startup
    * Memory buffers
    * Ring start/stop pointers
    
    To prevent multiple processes from accessing the DMA engine in parallel, the
    first action is buffer initialization, which will fail if the buffers are
    already in use
        * Always with REUSE, EXCLUSIVE, and HW flags
        * Optionally with PERSISTENT flag (if DMA_PERSISTENT flag is set)
    If another DMA app is running, the buffer allocation will fail (no dma_stop
    is executed in this case)

    Depending on the PRESERVE flag, kmem_free will be called either with the
    REUSE flag, keeping the buffer in memory (this is redundant since the HW
    flag is enough), or with the HW flag, indicating that the DMA engine is
    stopped and the buffer can be cleaned. The PERSISTENT flag is defined by the
    DMA_PERSISTENT flag passed to the stop routine.
    
    PRESERVE flag is enforced if DMA_PERSISTENT is not passed to the dma_stop
    routine and it is either:
        a) Explicitly set by the DMA_PERMANENT flag passed to the dma_start
        function
        b) Implicitly set if the DMA engine is already enabled during dma_start,
        all buffers are reused, and are in persistent mode.
    If the PRESERVE flag is on, the engine will not be stopped at the end of
    execution (and the buffers will stay because of the HW flag).
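
    The PRESERVE decision and the resulting kmem_free flags can be summarized
    as two small functions. This is a sketch of the rules as described in this
    section; the function and parameter names are hypothetical and the flags
    are represented as plain strings.

```python
def preserve_enforced(stop_persistent: bool, start_permanent: bool,
                      engine_was_enabled: bool, all_reused: bool,
                      all_persistent: bool) -> bool:
    """Decide whether the PRESERVE flag is in force at dma_stop time."""
    if stop_persistent:
        # DMA_PERSISTENT passed to dma_stop: PRESERVE is not enforced.
        return False
    explicit = start_permanent   # (a) requested at dma_start
    implicit = engine_was_enabled and all_reused and all_persistent   # (b)
    return explicit or implicit


def free_flags(preserve: bool, stop_persistent: bool) -> set:
    """Flags handed to kmem_free when the DMA context is released."""
    # REUSE keeps the buffers; HW tells the driver the engine is stopped.
    flags = {"REUSE"} if preserve else {"HW"}
    if stop_persistent:
        flags.add("PERSISTENT")
    return flags
```

    For instance, an engine that was found running with fully reused
    persistent buffers keeps running after a plain stop, and its buffers are
    released with {"REUSE"} only.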
    
    If the buffers are reused and are already in PERSISTENT mode, and the DMA
    engine was on before dma_start (the PRESERVE flag is ignored, because it can
    be enforced), the ring pointers are calculated from LAST_BD and the states
    of the ring elements. If the previous application crashed, the buffers may
    be corrupted. Two cases are possible:
    * If during the call the buffers were in non-PERSISTENT mode, it can be
    easily detected: the buffers are reused, but are not in PERSISTENT mode
    (or at least were not before we set them to it). In this case we just
    reinitialize all buffers.
    * If during the call the buffers were in PERSISTENT mode, it is up to the
    user to check their consistency and restart the DMA engine.
    
    IRQs are enabled and disabled at each call

DMA Reads
=========
standard:               default reading mode, reads a single full packet
multipacket:            reads all available packets
waiting multipacket:    reads all available packets; after finishing the
                        last one, waits for new data to arrive
exact read:             reads exactly the specified number of bytes (should
                        only be supported if it is a multiple of packets,
                        otherwise an error should be returned)
ignore packets:         autoterminates each buffer, depends on engine
                        configuration

 To handle the different cases, the value returned by the callback function
instructs the DMA library how long to wait for the next data to appear before
timing out. The following variants are possible:
terminate:              just bail out
check:                  no timeout, just check if there is data, otherwise
                        terminate
timeout:                standard DMA timeout, normally used while receiving
                        fragments of a packet: in this case it is expected
                        that the device has already prepared the data and only
                        the performance of the DMA engine limits transfer speed
wait:                   wait until the data is prepared by the device; this
                        timeout is specified as an argument to the dma_stream
                        function (standard DMA timeout is used by default)

                        first | new_pkt    | buffer
                        ------------------------------
standard                wait  | term       | timeout
multiple packets        wait  | check      | timeout    - DMA_READ_FLAG_MULTIPACKET
waiting multipacket     wait  | wait       | timeout    - DMA_READ_FLAG_WAIT
exact                   wait  | wait/term  | timeout    - limited by size parameter
ignore packets          wait  | wait/check | wait/check - just autoterminated
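
 The table above can be read as a lookup from reading mode and stage to the
value a callback should return. The sketch below covers only the three rows
with a single entry per column; the "exact" and "ignore packets" rows are
left out because the table gives alternatives for them. The names Action,
STAGES, POLICY and next_action are hypothetical, and the stage meanings
(first data, next packet, next buffer of the same packet) are my reading of
the column headers.

```python
from enum import Enum


class Action(Enum):
    TERMINATE = "terminate"   # just bail out
    CHECK = "check"           # no timeout, take data only if already there
    TIMEOUT = "timeout"       # standard DMA timeout
    WAIT = "wait"             # timeout given to the streaming call


# Column order of the table: first data, new packet, next buffer of a packet.
STAGES = ("first", "new_pkt", "buffer")

POLICY = {
    "standard": (Action.WAIT, Action.TERMINATE, Action.TIMEOUT),
    "multipacket": (Action.WAIT, Action.CHECK, Action.TIMEOUT),
    "waiting multipacket": (Action.WAIT, Action.WAIT, Action.TIMEOUT),
}


def next_action(mode: str, stage: str) -> Action:
    """Look up what the callback should request for this mode and stage."""
    return POLICY[mode][STAGES.index(stage)]
```

 A callback for the standard mode would, for example, return the value for
next_action("standard", "new_pkt"), which is Action.TERMINATE.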

Shall we do special handling in case of overflow?
    
Buffering
=========
 The DMA addresses are limited to 32 bits (~4GB for everything). This means we
 can't really use DMA pages as sole buffers. Therefore, a second thread, with
 a realtime scheduling policy if possible, will be spawned and will copy the
 data from the DMA pages into the allocated buffers. On expiration of the
 duration or number of events set by the autostop call, this thread will be
 stopped, but processing in streaming mode will continue until all copied data
 is passed to the callbacks.

 To avoid stalls, the IPECamera requires data to be read out continuously.
 For this reason, there are no locks in the readout thread. It will simply
 overwrite the old frames if data is not copied out in time. To handle this
 case, after getting the data and processing it, the calling application should
 use the return_data function and check the return code. This function may
 return an error indicating that the data was overwritten in the meantime.
 Hence, the data is corrupted and should be dropped by the application. The
 copy_data function performs this check and the user application can be sure it
 gets coherent data in this case.
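
 The overwrite check described above can be illustrated with a single-threaded
model of a lock-free frame ring. This is only a sketch: the FrameRing class,
the event numbering, and the method signatures are assumptions; only the names
get_data, return_data and copy_data are taken from the text.

```python
class FrameRing:
    """Ring of frame slots that the writer overwrites without locking."""

    def __init__(self, size: int):
        self.size = size
        self.slots = [None] * size
        self.written = 0   # total number of frames written so far

    def write(self, frame: bytes) -> None:
        # Readout side: never waits, the oldest frame is simply overwritten.
        self.slots[self.written % self.size] = frame
        self.written += 1

    def get_data(self, event_id: int) -> bytes:
        # Hands out the slot content; it may be overwritten at any time.
        return self.slots[event_id % self.size]

    def return_data(self, event_id: int) -> bool:
        # True if the slot still belongs to event_id, False if a newer
        # frame has replaced it since the event was written.
        return self.written - event_id <= self.size

    def copy_data(self, event_id: int) -> bytes:
        data = bytes(self.get_data(event_id))   # private copy first
        if not self.return_data(event_id):      # then validate it
            raise RuntimeError("frame was overwritten during the copy")
        return data
```

 With a ring of two slots, writing a third frame makes return_data(0) report
the overwrite while event 1 is still valid.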
 
 There is a way to avoid this problem. For raw data, the rawdata callback
 can be requested. This callback blocks execution of the readout thread, so
 the data may be treated safely by the calling application. However, this may
 cause problems for the electronics. Therefore, normally only a memcpy should
 be performed on the data.

 The reconstructed data, however, may be safely accessed. As described above,
 the raw data will be continuously overwritten by the reader thread. However,
 reconstructed data, upon the get_data call, will be protected by the mutex.

Register Access Synchronization
===============================
 We need to serialize access to the registers by the different running
 applications and handle the case when registers are accessed indirectly by
 writing PCI BARs (DMA implementations, for instance).

 - Module-assisted locking:
 * During initialization the locking context is created (which is basically
 a kmem_handle of type LOCK_PAGE).
 * This locking context is passed to the kernel module along with the lock type
 (LOCK_BANK) and lock item (BANK ADDRESS). If the lock context already owns
 a lock on the specified bank, just the reference number is increased;
 otherwise we try to obtain a new lock.
 * The kernel module just iterates over all registered lock pages and checks
 if any holds the specified lock. If not, the lock is obtained and registered
 in our lock page.
 * This allows sharing access between multiple threads of a single application
 (by using the same lock page) or protecting it (by using a separate lock page
 for each of the threads)
 * Either on application cleanup or if the application crashed, the memory
 mapping of the lock page is removed and, hence, the locks are freed.
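
 The locking scheme above can be modelled in a few lines. This is a sketch and
not the kernel module: the LockModule class and its method names are
hypothetical, lock pages are identified by plain strings, and a failed
attempt returns False instead of waiting.

```python
class LockModule:
    """Model of bank locks kept in per-context lock pages."""

    def __init__(self):
        self.pages = {}   # lock page id -> {bank address: reference count}

    def register_page(self, page_id: str) -> None:
        self.pages.setdefault(page_id, {})

    def lock_bank(self, page_id: str, bank: int) -> bool:
        own = self.pages[page_id]
        if bank in own:
            own[bank] += 1   # already ours: only count the reference
            return True
        # Iterate over all other registered pages looking for a holder.
        for other_id, locks in self.pages.items():
            if other_id != page_id and bank in locks:
                return False
        own[bank] = 1
        return True

    def unlock_bank(self, page_id: str, bank: int) -> None:
        own = self.pages[page_id]
        own[bank] -= 1
        if own[bank] == 0:
            del own[bank]

    def unmap_page(self, page_id: str) -> None:
        # Cleanup or crash: the mapping goes away and so do all its locks.
        self.pages.pop(page_id, None)
```

 Threads that share one lock page can lock the same bank repeatedly, while a
second page is refused until the first one has released every reference or
has been unmapped.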
 
 - Multiple ways of accessing registers
 Because of reference counting, we can successfully obtain locks multiple
 times if necessary. The following locks protect register access:
  a) The global register_read/write functions lock the bank before executing
  the implementation
  b) The DMA bank is locked by the global DMA functions. So we can access the
  registers using plain PCI BAR read/write.
  c) A sequence of register operations can be protected with the
  pcilib_lock_bank function
 Reading the raw register space or a PCI bank is not locked.
  * OK. We can detect banks which will be affected by a PCI read/write and
  lock them. But shall we do it?
 
Register/DMA Configuration
==========================
 - XML description of registers
 - Formal XML-based (or non XML-based) language for DMA implementation.
   a) Writing/Reading register values
   b) Wait until <register1>=<value> on <register2>=<value> report error
   c) ... ?

IRQ Handling
============
 IRQ types: DMA IRQ, Event IRQ, other types
 IRQ hardware source: To allow a purely user-space implementation, as a general
 rule, only a single (standard) source should be used.
 IRQ source: The dma/event engines, however, may refine this hardware source
 and produce the real IRQ source based on the values of registers. For example,
 for DMA IRQs the source may represent the engine number and for Event IRQs the
 source may represent the event type.

 Only types can be enabled or disabled. The sources are enabled/disabled
 by enabling/disabling the corresponding DMA engines or Event types. The
 expected workflow is the following:
 * We enable IRQs in user space (normally by setting some registers). Normally
 these are just the Event IRQs; the DMA IRQs, if necessary, will be managed by
 the DMA engine itself.
 * We wait for the standard IRQ from the hardware (driver)
 * In user space, we check registers to find out the real source of the IRQ
 (the driver reports just the hardware source), generate the appropriate
 events, and acknowledge the IRQ. This is implementation-dependent and should
 be managed inside the event API.
 
 I.e. the driver implements just two methods: pcilib_wait_irq(hw_source) and
 pcilib_clear_irq(hw_source). Only a few hardware IRQ sources are defined.
 In most circumstances, IRQ_SOURCE_DEFAULT is used.
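
 The workflow can be sketched with a stand-in for the driver. Nothing below is
the pcilib API: FakeDriver, handle_irq, the "event_status" register name and
the value 0 for the default hardware source are all made up for illustration.
The point is only the split of work, where the driver reports the hardware
source and user space finds the real source from registers.

```python
IRQ_SOURCE_DEFAULT = 0   # illustrative value only


class FakeDriver:
    """Stand-in for the kernel driver: knows hardware sources only."""

    def __init__(self, pending: int):
        self.pending = pending   # number of IRQs waiting to be delivered
        self.cleared = 0

    def wait_irq(self, hw_source: int) -> bool:
        if self.pending == 0:
            return False         # models a timeout
        self.pending -= 1
        return True

    def clear_irq(self, hw_source: int) -> None:
        self.cleared += 1


def handle_irq(driver, registers, handlers, hw_source=IRQ_SOURCE_DEFAULT):
    """Wait for one hardware IRQ, resolve its real source, acknowledge it."""
    if not driver.wait_irq(hw_source):
        return None
    event_type = registers["event_status"]   # real source from registers
    handlers[event_type]()                   # generate the event
    registers["event_status"] = 0            # acknowledge on the device side
    driver.clear_irq(hw_source)
    return event_type
```

 In this model the handlers dictionary plays the role of the event API, which
maps the decoded source to the event that should be generated.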
 
 The DMA engine may provide 3 additional methods to enable, disable,
 and acknowledge IRQs.
 
 ... To be decided in detail as the need arises...

Updating Firmware
=================
 - JTAG should be connected to the USB connector on the board (next to Ethernet)
 - The computer should be turned off and on before programming
 - The environment variables should be loaded
    . /home/uros/.bashrc
 - The application is called 'impact'
    No project is needed, cancel initial proposals (No/Cancel)
    Double-click on "Boundary Scan"
    Right click in the right window and select "Init Chain"
    We don't want to select a bit file now (Yes and, then, click Cancel)
    Right click on the second (right) item and choose "Assign new CF file"
    Select a bit file. Answer No, we don't want to attach SPI to SPI Prom
    Select xc6vlx240t and program it
 - Shut down and start the computer
 
 Firmware images are in
    v.2: /home/uros/Repo/UFO2_last_good_version_UFO2.bit
    v.3: /home/uros/Repo/UFO3
        Step5 - best working revision
        Step6 - last revision