1.1 QoE

Content generation, delivery and presentation play a crucial role in a collaborative system. In the Victory context, efforts are focused on generating solutions that facilitate a good user experience of the overall system. Multimedia content will in general be delivered under varying network access and resource conditions, communication device capabilities and end-user preferences. In this context the focus is on the rendering and visualization phase (remote or local) and on the search and retrieval system.

1.1.1. Visualization on client device

The conceptual framework of the proposed rendering system is sketched in Figure 1.

Figure 1 Conceptual framework of the rendering system

(1) Once the search results are received, the client submits a model visualization session request to the remote server. This involves two phases:
(a) user identification (authorization step);
(b) resource negotiation: the client sends information about its device (i.e. PC desktop, laptop, PDA, smartphone) and its rendering preferences (expected frame rate, image resolution). The remote server keeps a database of devices and their characteristics (resolution, colour depth, ...), so the user can choose his device from a precompiled list (e.g. the characteristics of almost all mobile smartphones are available at www.jbenchmark.com). If the device is not present in the list, a form can be displayed (the user fills in the fields that describe his device) and the database is updated with the new device. If the client device is a PC desktop/workstation or laptop, the user can choose whether to privilege resolution or frame rate. If the client device is a smartphone or a PDA, the system automatically privileges maximum resolution (because of the intrinsic characteristics of handheld devices, a very high resolution is not achievable anyway).
The remote server then determines whether the client is able to perform the rendering itself (a relation between the number of vertices/triangles of the model and the rendering time on the client side has to be established; see the sketch after this list).
(2a) If it is, the model is downloaded and rendering is done locally on the client.
(2b) If the client is not able to do the rendering, the remote server searches for a rendering server and sends it the model to be displayed. At this point a direct connection between the rendering server and the client device is established. The download connection speed (bitrate) available at the client side is measured, and this information is used to maximize the perceived quality of the rendering service by suggesting an appropriate configuration of the parameters that affect bandwidth occupation.
(3) Still images (MJPEG) or a video stream (MPEG, H.263) are computed on the rendering server side and sent over the network to be displayed on the client device (the video encoding depends on the client's characteristics).
(4) The user can inspect the model by sending navigation commands to the rendering server, and can also send commands to re-negotiate the connection parameters.
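The decision between (2a) and (2b) could look like the following minimal Python sketch. The DeviceProfile fields, the throughput figure and the 10 fps budget are illustrative assumptions, not values fixed by the Victory specification; the linear triangles-to-time relation is exactly the relation that, as noted above, still has to be established empirically.

from dataclasses import dataclass

@dataclass
class DeviceProfile:
    device_type: str        # "desktop", "laptop", "pda" or "smartphone"
    width: int              # native display width [pixel]
    height: int             # native display height [pixel]
    colour_depth: int       # [bpp]
    triangles_per_s: float  # assumed rendering throughput of the device

TARGET_FRAME_TIME_S = 1.0 / 10   # assumed budget: at least 10 fps locally

def can_render_locally(profile: DeviceProfile, model_triangles: int) -> bool:
    # Assume rendering time grows linearly with the triangle count; the
    # actual relation has to be established empirically, as noted above.
    estimated_frame_time = model_triangles / profile.triangles_per_s
    return estimated_frame_time <= TARGET_FRAME_TIME_S

# Example: a 500 000-triangle model on a hypothetical smartphone profile.
phone = DeviceProfile("smartphone", 176, 220, 16, triangles_per_s=200_000)
if can_render_locally(phone, 500_000):
    print("step (2a): download the model and render locally")
else:
    print("step (2b): delegate to a rendering server")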
1.1.2. Remote rendering

In order to assure a good user experience, multimedia content has to be adapted to the limitations of the client's terminal and to the network characteristics. Such multimedia adaptation could be, for instance, transcoding from one video format to another, or scaling a video in the spatial domain so that it fits the terminal's screen. Furthermore, the system is required to provide the user with the best possible variation of a multimedia resource that the user is capable of receiving.

In this sense, the concept of searching, retrieving and inspecting Multipedia objects deals with the quality of the content that is delivered. Quality is treated as an end-to-end Quality of Service aggregate, which we choose to view as Quality of Experience (QoE). Increasingly, this idea is evolving to include the user and his perception of the media being delivered. The resource negotiation will allow the determination and assessment of the user's Quality of Experience from the individual qualities of service specified or measured within the connection network. The key performance indicators for the remote rendering phase are:
● download bitrate (bandwidth)
● frame rate
● resolution
● compression ratio (affecting the image quality)
● colour resolution
● command latency time for 3D content manipulation
● processing speed
● power consumption
● security

It has to be decided whether data ciphering over the channel is needed in order to guarantee privacy and security to the client. In this case an RSA algorithm can be used (e.g. the MIRACL library in C).

Parameter negotiation policy

The connection setup phase between client and rendering provider involves the negotiation of resolution and image quality. During this phase the image quality (resolution, compression ratio) and the bandwidth requirements are negotiated. On the client side, the effective bandwidth (download bitrate) can be measured; this is the available bandwidth.

Bandwidth estimation. This issue could be covered using the feedback information delivered in the receiver reports of RTP. RTP offers a control protocol, RTCP, that carries the information necessary for Quality of Service monitoring. In this context, if a ready-made RTP implementation cannot be used (one will be searched for), we can implement only the parts we need (i.e. the sender and receiver reports). Rendering servers generate a so-called sender report. It contains information useful for inter-media synchronization as well as cumulative counters for packets and bytes sent, which allow the receivers (clients) to estimate the actual data rate.

Transmission bandwidth computation (server side)

The transmission bandwidth is the amount of data transmitted over a period of time and is a function of several parameters:

BWTX = fr ∙ R ∙ Iq / C

where
● BWTX is the transmission bandwidth [bps]
● fr is the frame rate [frame/s]
● R is the resolution [pixel/frame]
● Iq is the image quality (colour resolution) [bpp]
● C is the compression ratio

For example, images with 640×480 resolution and 24 bpp of colour depth (Iq), with a 20:1 compression ratio, sent over the network at 30 fps occupy a bandwidth of about 11 Mbps.

If the transmission bandwidth exceeds the average bandwidth of the viewer/client's connection, the video may stutter during playback. With so many different network connection types available, a target data rate must be selected based on the average download bandwidth of the client. Encoders usually adjust their compression ratio automatically based on the image resolution, frame rate and target data rate settings. A mobile user in the Victory framework will be able to maximize the perceived Quality of Experience by reaching a balance between image quality and frame rate. (An attempt could also be made to develop a module able to adjust these parameters automatically.)
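The formula can be made concrete with a minimal sketch that reproduces the 640×480 example quoted above; the function name is illustrative.

def transmission_bandwidth(fr: float, width: int, height: int,
                           iq: int, c: float) -> float:
    """Transmission bandwidth BWTX in bps.
    fr: frame rate [frame/s]; width*height: resolution R [pixel/frame];
    iq: colour resolution [bpp]; c: compression ratio (20 means 20:1)."""
    return fr * width * height * iq / c

# 640x480, 24 bpp, 20:1 compression, 30 fps -> about 11 Mbps, as in the text.
bw = transmission_bandwidth(fr=30, width=640, height=480, iq=24, c=20)
print(f"{bw / 1e6:.2f} Mbps")   # -> 11.06 Mbps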
Let us assume we have at our disposal i samples for the frame rate fr, j custom resolution samples for R, k predefined values for the compression ratio C and l samples for Iq. The combinations of these four performance indicators then have a depth equal to

d = i ∙ j ∙ k ∙ l

We can pre-compute the transmission bandwidth needed for all the combinations of these parameters, obtaining a data structure of depth d similar to Table 1 (the column parameters are addressed in the next pages).

Table 1 Bit-rate computation according to rendering parameters.

The goal is to maximize the user's perceived Quality of Experience, i.e. to capitalize on the download bitrate available at the user's device at the current connection speed by setting the transmission bitrate of the rendering server accordingly. The client may send his rendering preferences (resolution, expected frame rate, colour resolution) if they differ from the standard values stored in the database (otherwise the standard resolution of the device is assumed; the client can, however, explicitly ask for a lower resolution). The client also sends the measured download bitrate, which acts as an initial estimate of the current networking capabilities. The server "removes" all the entries of the table above that correspond to a transmission bandwidth (bitrate) larger than the available receive bitrate, then also "removes" all the entries that correspond to parameters unsuitable for this connection (such as a resolution greater than the requested one, or a colour resolution not available on the client side), and searches for the configuration (or configurations) that best meets the rendering preferences. In general this can be done by sorting the remaining rows of the table in descending order of transmission bitrate, weighting the parameters, and then presenting the available configurations to the client, who selects the preferred one; a sketch of this filtering and sorting is given below.

For example, in the table above, if the measured bitrate is 150 Kbps, the requested resolution is 128×160 and the expected frame rate is 30 fps, a list of suitable configurations is sent to the client, who selects a possible solution; for instance, the solution with a compression ratio of 100:1, i.e. MPEG compression, can be chosen. Typically, for a mobile device, the possible solutions to be sent to the client are generated using the standard resolution of that device as the sorting parameter, to the detriment of the frame rate (this means that if a particular parameter configuration is not feasible at the current download bitrate, the parameter that will be downgraded in the presented solutions is the frame rate). For example, in the table above, if the measured bitrate is 80 Kbps, the requested resolution is 128×160 and the frame rate expected by the user is 30 fps, one of the first solutions presented to the client is expected to be the one with a frame rate of 16 fps, which occupies a bandwidth of 78.6 Kbps.

Once a resolution has been settled in the negotiation step (whether explicitly requested by the user or retrieved from the precompiled list), the resolution can be scaled down (re-negotiating the parameters and re-sorting the entries of the table above) if the user needs this. Scaling up the resolution, vice versa, could be a problem, because the client visualization application would have to be closed and re-opened. If the user must be able to tune these parameters manually, he can re-negotiate another visualization session.
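The following sketch shows the pre-computed table and the filter-and-sort negotiation just described. The sample grids, the dictionary layout and the selection criteria are illustrative assumptions (the real i, j, k, l values and weights are a design choice).

from itertools import product

FR  = [10, 15, 20, 25, 30]                   # frame rate samples [fps]
RES = [(128, 160), (176, 220), (320, 240)]   # resolution samples [pixel]
C   = [20, 50, 100]                          # compression ratios
IQ  = [16, 24]                               # colour resolutions [bpp]

# Pre-compute the d = i*j*k*l entries of Table 1 with their bitrate.
TABLE = [{"fr": fr, "res": (w, h), "c": c, "iq": iq,
          "bw": fr * w * h * iq / c}
         for fr, (w, h), c, iq in product(FR, RES, C, IQ)]

def negotiate(download_bps: float, max_res: tuple, max_iq: int) -> list:
    # Remove entries whose bitrate exceeds the measured download bitrate or
    # whose parameters are unsuitable, then sort by descending bitrate so the
    # configurations that best exploit the available bandwidth come first.
    feasible = [e for e in TABLE
                if e["bw"] <= download_bps
                and e["res"][0] <= max_res[0] and e["res"][1] <= max_res[1]
                and e["iq"] <= max_iq]
    return sorted(feasible, key=lambda e: e["bw"], reverse=True)

# Example in the spirit of the text: 80 Kbps measured, 128x160 requested.
for entry in negotiate(80_000, (128, 160), 24)[:3]:
    print(entry)   # the top entries trade the 30 fps request for ~15 fps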
Dynamic adjustment of the transmission bandwidth

Network load may change significantly during a connection, so it may be necessary to adapt the transmission bandwidth of the rendering server to the client's download bitrate. In order to maximize client satisfaction, the rendering server should strive to give the client the best available media quality and thus make maximum use of the available bandwidth. We can use the feedback information delivered in the receiver reports of RTP (or implement a similar protocol). This feedback information allows the source to estimate the loss rates experienced by the receivers and to adjust its bandwidth accordingly. The feedback control scheme is shown in Figure 2.

Figure 2 The feedback control scheme

The rendering server generates a sender report (SR), which contains information useful for inter-media synchronization as well as cumulative counters for packets and bytes sent; these allow the receivers to estimate the actual data rate. Sender reports are needed because they carry information about the packets sent by the rendering server, so that a client can estimate parameters such as data loss. On receiving a receiver report (RR), the rendering server performs the following steps:

(1) RTCP analysis. The receiver reports are analysed and statistics of packet loss, packet delay jitter and round-trip time are computed. The source keeps a record for the receiver containing the most recent receiver reports, the information of the session description packets, the loss rate and the packet delay jitter. The loss rate is used as congestion indicator. A filter can be used to smooth the statistics and avoid QoS oscillations: the smoothed loss rate λ is computed by the low-pass filter λ = (1 − α) λ + α b, where b is the new value and 0 < α < 1. Increasing α increases the influence of the new value, while decreasing α gives a higher influence to the previous values.

(2) Network state estimation. The actual network congestion state seen by the client is determined as unloaded, loaded or congested; this is used to decide whether to increase, hold or decrease the bandwidth requirements of the sender. As a measure of congestion we use the smoothed loss rate observed by the receivers, with two thresholds delimiting the three states: below the lower threshold the network is considered UNLOADED, between the two thresholds LOADED, and above the upper threshold CONGESTED.

(3) Bandwidth adjustment. The transmission bandwidth of the rendering server is adjusted according to the decision of the network state analysis. The user can set the range of adjustable bandwidth, i.e. specify the minimum and maximum bandwidth. In the case of congestion the application running on the server side should rapidly reduce its bandwidth; therefore we use a multiplicative decrease µ when we get a DECREASE message and an additive increase ν when the decision is INCREASE. If the decision is HOLD, no changes take place. We make sure that the bandwidth is always larger than a minimum bandwidth bmin, to guarantee a minimal quality of service at the receivers; a maximum bandwidth bmax can also be set. The bandwidth adjustment algorithm is:

if client is CONGESTED --> d = DECREASE, then ba = max{ µ br , bmin }
else if client is LOADED --> d = HOLD
else --> d = INCREASE, then ba = min{ br + ν , bmax }

where br is the reported bandwidth and ba the allowed bandwidth: the former is the actual bandwidth as reported in the most recent RTCP sender report, the latter the allowed bandwidth that can be used by the multimedia application. In this way the parameter configuration (frame rate, compression, ...) can be modified.
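A minimal sketch of this feedback loop follows. The filter weight, the two loss thresholds and the AIMD gains are illustrative assumptions; only the structure (smoothing, three-state estimation, additive increase / multiplicative decrease within [bmin, bmax]) comes from the scheme above.

ALPHA        = 0.3       # low-pass weight alpha for new loss values
T_UNLOADED   = 0.02      # smoothed loss below this -> UNLOADED  -> INCREASE
T_CONGESTED  = 0.08      # smoothed loss above this -> CONGESTED -> DECREASE
MU           = 0.75      # multiplicative decrease factor mu
NU           = 50_000    # additive increase step nu [bps]
B_MIN, B_MAX = 64_000, 2_000_000   # user-set bandwidth range [bps]

smoothed_loss = 0.0

def on_receiver_report(loss_rate: float, b_reported: float) -> float:
    """Process one receiver report; return the allowed bandwidth ba [bps]."""
    global smoothed_loss
    # (1) RTCP analysis: lambda = (1 - alpha) * lambda + alpha * b
    smoothed_loss = (1 - ALPHA) * smoothed_loss + ALPHA * loss_rate
    # (2) Network state estimation and (3) bandwidth adjustment
    if smoothed_loss > T_CONGESTED:            # CONGESTED -> DECREASE
        return max(MU * b_reported, B_MIN)
    if smoothed_loss < T_UNLOADED:             # UNLOADED  -> INCREASE
        return min(b_reported + NU, B_MAX)
    return b_reported                          # LOADED    -> HOLD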
Frame rate

Frame rate is expressed in frames per second (fps). At a frame rate of about 15 fps and above, the viewer perceives full-motion video; this rate allows for smooth playback over the Internet. Setting the frame rate below 15 fps may make video playback appear slow and sluggish, but if the application is intended for viewing 3D models a frame rate of 10 fps can be considered acceptable. Since the human eye can hardly distinguish a difference of about 1 fps in motion video, in building the collection of pre-configured parameters (the table above) we can consider frame rates starting from 10 fps in steps of 5 fps (10 fps, 15 fps, 20 fps and so on).

Resolution

Because resolution is a function of height × width, doubling the resolution quadruples the data rate. The most common resolution for a high-speed Internet connection is 320 × 240 pixels, while slower connections typically use 240 × 180 or 176 × 144 pixels. The typical resolution of almost any mobile phone can be retrieved at www.jbenchmark.com (e.g. 128×160, 176×220, 240×320, 352×416, 640×376). The resolution parameter raises another issue: the aspect ratio (width/height) must be maintained if the resolution is scaled down.

Image quality

Colour resolution. Typical values are 24 bit or 16 bit.

Compression ratio

Compression techniques reduce the amount of data while preserving the best possible visual image. Based on studies of human visual perception, compression techniques eliminate the information our eyes are less sensitive to. Too much compression, however, may remove too much information, resulting in visible and unacceptable artefacts in the image. The best compressor maintains high image quality even at a high compression ratio. For example, it would be impossible to maintain smooth playback of an uncompressed 320 × 240 video clip at 15 fps over a typical Internet connection: the uncompressed clip requires about 27.6 Mbps = (320 pixels × 240 pixels × 24 bit/pixel)/frame × 15 frames/s, but if the connection only sustains bandwidths between 500 and 700 Kbps, the clip must be compressed. Generally, with MPEG, compression ratios of 100:1 are common with good image quality; Motion JPEG (MJPEG) provides ratios ranging from 15:1 to 80:1, although about 20:1 to 25:1 is the maximum for maintaining a good-quality image. The choice of compression method depends on the device capabilities (PC desktop, laptop, PDA or smartphone) and could be different for each device, so various compression schemes have to be considered (JPEG, JPEG2000, MJPEG, MPEG, H.263); for a smartphone, for instance, the decoding time of one standard could be problematic and introduce a bottleneck compared with another scheme. The following table shows an example of the video quality (compression ratio) obtainable over a wide range of connection speeds:

Table 2 Example of video quality (compression ratio) obtainable over a wide range of connection speeds.
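The compression example above can be turned around to ask what ratio a given link requires; a minimal sketch, with an assumed 600 Kbps target:

def required_compression_ratio(width: int, height: int, iq: int,
                               fr: float, target_bps: float) -> float:
    uncompressed_bps = width * height * iq * fr   # R * Iq * fr
    return uncompressed_bps / target_bps

# 320x240, 24 bpp, 15 fps (~27.6 Mbps uncompressed) into a 600 Kbps link:
print(f"{required_compression_ratio(320, 240, 24, 15, 600_000):.0f}:1")  # 46:1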
Overview of compression standards

JPEG. JPEG compression can be performed at different user-defined compression levels, which determine how much an image is compressed. The compression level selected is directly related to the image quality requested. Besides the compression level, the image content itself also affects the resulting compression ratio.

JPEG2000. Another still-image compression standard is JPEG2000, developed by the same group that developed JPEG. Its main targets are medical applications and still-image photography. At low compression ratios it performs similarly to JPEG, while at very high compression ratios it performs slightly better.

Motion JPEG. Motion JPEG offers video as a sequence of JPEG images and is the most commonly used standard in network video systems. The rendering server can compress, for example, 30 individual images per second and make them available as a continuous flow of images over the network to a viewing station. As each individual image is a complete JPEG-compressed image, they all have the same guaranteed quality, determined by the compression level chosen for the video server.

H.263. The H.263 compression technique targets fixed-bitrate video transmission. The downside of a fixed bitrate is that when an object moves, the quality of the image decreases. H.263 was originally designed for video-conferencing applications, not for surveillance, where details are more crucial than a fixed bitrate: the image of a moving object becomes mosaic-like when H-series compression is used, while the normally uninteresting background retains its good, clear image quality.

MPEG. MPEG's basic principle is to compare successive compressed images to be transmitted over the network. The first compressed image is used as a reference frame, and only the parts of the following images that differ from the reference are sent. The network viewing station then reconstructs all images from the reference image and the "difference data". Despite its higher complexity, MPEG video compression leads to lower data volumes being transmitted across the network than Motion JPEG.

MPEG-4. The two groups behind H.263 and MPEG-4 joined to form the next-generation video compression standard: AVC (Advanced Video Coding), also called H.264 or MPEG-4 Part 10. The intent is to achieve very high data compression: the standard is capable of providing good video quality at bitrates substantially lower than previous standards require, without so large an increase in complexity as to make the design impractical or expensive to implement.

Some considerations about the quality supplied by compression standards

The key consideration is to select a video compression standard that ensures high image quality, such as Motion JPEG or MPEG-4. The system could also be conceived so that, when the user sends rototranslation commands to the server, the compression module increases the compression ratio, resulting in a smoother video sequence: when the 3D model is inspected as a "still picture" it is more important to guarantee image quality, while when the model is inspected by sending rototranslation commands it is more important to guarantee the smoothness of the video sequence by improving the frame rate. A sketch of such a policy follows.
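A minimal sketch of the interaction-dependent policy just described; the two profiles and their values are illustrative assumptions, not negotiated figures.

STILL_PROFILE      = {"compression": 20,  "fr": 10}  # low ratio, high quality
NAVIGATING_PROFILE = {"compression": 100, "fr": 25}  # high ratio, fluid motion

def encoder_profile(user_is_navigating: bool) -> dict:
    # Favour image quality for still inspection, smoothness during
    # rototranslation of the 3D model.
    return NAVIGATING_PROFILE if user_is_navigating else STILL_PROFILE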
Due to its simplicity, the widely used Motion JPEG is often a good choice. There is limited delay between image rendering, encoding, transfer over the network, decoding and final display at the viewing station: in other words, Motion JPEG provides low latency thanks to its simplicity (image compression and complete individual images). Any practical image resolution, from mobile-phone display size (QVGA) up to full video size (4CIF) and above (megapixel), is available in Motion JPEG. The system guarantees image quality regardless of movement or image complexity, while offering the flexibility to select either high image quality (low compression) or lower image quality (high compression), with the benefit of smaller image files and thus lower bitrate and bandwidth usage. The frame rate can easily be adjusted to limit bandwidth usage without loss of image quality. However, Motion JPEG generates a relatively large volume of image data to be sent across the network. In this respect MPEG has the advantage of sending a lower bitrate across the network than Motion JPEG, except at low frame rates. In conclusion, if the available network bandwidth is limited, or if the video stream is to be received at a high frame rate, MPEG may be the preferred option: it provides relatively high image quality at a lower bitrate (bandwidth usage). Still, the lower bandwidth demand comes at the cost of higher encoding and decoding complexity, which in turn contributes to a higher latency compared with Motion JPEG. The video compression generated by the rendering server and delivered to a client will also be chosen depending on the client device (decoding performance differs between a desktop PC and a mobile phone).

Changing network conditions

When network conditions change and degrade too much, the client device asks to re-negotiate the parameters. This can be done automatically when too many packets are discarded: if the bitrate degrades, single frames of the rendering sequence reach the client device late and are discarded. A timestamp check can be performed on each frame:
● the rendering server generates a frame and creates a custom header that includes a temporal indication associated with this frame, timestampServer(#n);
● when the client receives a frame, it reads the associated timestampServer from the custom header and creates its own timestamp, timestampClient(#n), for the currently processed frame;
● the next frame rendered on the server side includes timestampServer(#n+1) = timestampServer(#n) + 1/fr;
● when the next rendered frame (#n+1) reaches the client device and is processed, it is likewise assigned timestampClient(#n+1), and the difference δc = timestampClient(#n+1) − timestampClient(#n) is computed;
● the client knows the instantaneous frame generation rate fr of the server by computing the difference δs = timestampServer(#n+1) − timestampServer(#n);
● if δc > 2 δs, the frame is too old.
When a frame is too old it is discarded, and a re-negotiation policy can be triggered (typically the frame rate generated on the server side is reduced); a sketch of this check follows.
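The staleness test condenses to a few lines; the function name is illustrative, the 2·δs threshold is the one given above.

def frame_is_too_old(ts_server_prev: float, ts_server_curr: float,
                     ts_client_prev: float, ts_client_curr: float) -> bool:
    delta_s = ts_server_curr - ts_server_prev   # = 1/fr at the server
    delta_c = ts_client_curr - ts_client_prev   # observed at the client
    return delta_c > 2 * delta_s                # discard, maybe re-negotiate

# Example: server generates at 10 fps (0.1 s spacing) but frames arrive
# 0.25 s apart -> the frame is considered too old and is discarded.
print(frame_is_too_old(1.0, 1.1, 2.0, 2.25))    # True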
Rendering server overload

If too many users attempt to connect to the rendering server, rendering performance could suddenly degrade. The server's capabilities have to be kept in mind, and it must be assessed whether too many client visualization requests towards a single rendering server could compromise the initial quality negotiated with each client. In this case a policy could be chosen that limits the number of clients accessing the rendering resources, or it could be decided to accept a higher number of users to the detriment of quality (a downgrade of the frame rate). The frame rate generated by the rendering server is monitored: when it drops below a fixed threshold following a new user connection, the last client request is refused and the client is notified with the message "Too many users connected".

Adaptive modification of the quality parameters

An attempt could be made to adjust automatically the parameters involved in the visualization phase (frame rate, resolution, compression ratio). The idea is to monitor the user's perception of video quality using appropriate metrics (e.g. metrics designed for measuring block-edge impairments in a video frame at the receiver end, or metrics that evaluate the quality of the reconstructed video frame in the event of packet loss). There exist metrics with low computational complexity that are feasible for real-time monitoring of streaming video in a multimedia communication scenario; these metrics could serve as feedback parameters to dynamically adapt the video rates based on network congestion.

Visualization session quality feedback

When the visualization session is terminated, feedback concerning the quality of service associated with a particular set of parameters is requested; this information can be correlated with the characteristics of the client device. A possible shape for such a feedback record is sketched below.
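All field names in this sketch are illustrative assumptions, not part of the Victory specification; the record simply pairs the negotiated parameters with the user's rating so the two can be correlated per device class.

from dataclasses import dataclass

@dataclass
class SessionFeedback:
    device_type: str    # e.g. "smartphone"
    resolution: tuple   # negotiated (width, height) [pixel]
    frame_rate: float   # negotiated frame rate [fps]
    compression: float  # negotiated compression ratio
    user_rating: int    # perceived quality, e.g. 1 (poor) .. 5 (excellent)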