AssistantExtensions
SBSettings toggles | Tweets | Chatbot | and more for Siri

Introduction

In this tutorial I am going to show you how to create a simple Siri Extension (see image).


Here you can download a complete "Hello snippet" example ready for compilation. The code is self-explanatory but for those interested in more details, just continue reading for a complete walkthrough.

If you like theos, ready-to-use nic templates are here. Place them in $THEOS/templates/iphone, create a new project using nic.pl, run "make package" and your first Siri extension .deb is made! This is the preferred way of developing extensions. However, the zip file above also contains an example of how to load nibs (UIs made in Interface Builder, which is a more convenient way to create UIs for snippets).

Update: iOSOpenDev contains AE templates as well. Alternatively, there are my experimental templates for Xcode [mirror] (extract to ~/Library/Developer/Xcode/Templates and restart Xcode).


Requirements

First, you need to have the SiriObjects.h header in the working directory of your Siri Extension. This is the only thing you need: there are no libraries or frameworks to link against. You can use Xcode or simply the "make" command to build the extension. A Makefile and an Xcode project are provided with the tutorial Extension; modify these to suit your needs.

Let's begin

A Siri Extension is a so-called "bundle", which is in fact a directory containing a binary "image" and a property list (plist) describing it. It can also optionally contain a number of support files (images, nibs, whatever).

This Siri Extension contains one "command" class and one "snippet" class. Commands are used for processing input from Siri, while snippets are used for displaying information.

So let's create two Objective-C files and their corresponding header files: HelloCommands and HelloSnippet.

We also need to specify one important thing in the HelloSnippet-Info.plist. That is, the "principal class" name. The principal class is the "main" class which gets called at the beginning of your Extension's initialization. We will put K3AHelloSnippetExtension there.

Open the HelloCommands.h and HelloSnippet.h header files and import SiriObjects.h at the top:
#import "SiriObjects.h"

HelloSnippet.h

Then, we will need to declare the principal class (which we named K3AHelloSnippetExtension earlier in the plist). We will put this into the HelloSnippet.h header file.
@interface K3AHelloSnippetExtension : NSObject<SEExtension> 

-(id)initWithSystem:(id<SESystem>)system;

-(NSString*)author;
-(NSString*)name;
-(NSString*)description;
-(NSString*)website;

@end
As you can see, the principal class inherits from NSObject and must conform to the SEExtension protocol. There is one method which you are required to implement: initWithSystem:. It initializes your extension and receives the "system" parameter, with which you register your snippets and commands.
Optionally, you can also implement methods for returning the extension's name, author, description and website URL.

Since we want to implement a snippet, we need to import the UIKit header. We will put this at the top of HelloSnippet.h:
#import <UIKit/UIKit.h>
Then, we can declare the snippet itself:
@interface K3AHelloSnippet : NSObject<SESnippet> {
    UIView* _view;
    IBOutlet UIView* _helloNib;
    IBOutlet UILabel* _helloLabel;
}

- (id)initWithProperties:(NSDictionary*)props;
- (id)view;

@end
The snippet again inherits from NSObject, but it must conform to the SESnippet protocol. We have a simple UIView and a label in our nib file (prepared in advance using Interface Builder), so we need to create IBOutlets for these. We also need to declare two methods: initWithProperties: and view.

HelloSnippet.mm

We have everything ready for the snippet so let's implement it in the HelloSnippet.mm file. First, we need to import the needed headers, so we put this at the beginning of the file:
#import "HelloSnippet.h"
#import "HelloCommands.h"
#import <Foundation/Foundation.h>
Then, we can start implementing individual methods:
@implementation K3AHelloSnippet
Let's start with the view method. This one is very simple:
- (id)view
{
    return _view;
}
It simply returns _view.

We also need to implement the dealloc method, to clean up after the snippet has been dismissed.
- (void)dealloc
{
    [_view release];
    [super dealloc];
}
We only have one view, so this is very simple. Finally, we will implement initWithProperties.
- (id)initWithProperties:(NSDictionary*)props
{
    if ( (self = [super init]) )
    {
        if (![[NSBundle bundleForClass:[self class]] loadNibNamed:@"HelloNib" owner:self options:nil])
        {
            NSLog(@"Warning! Could not load nib file.\n");
            [self release]; // init failed, so give up ownership of self
            return nil;     // an init method returns an object or nil, not a BOOL
        }
        _view = [_helloNib retain];
        [_helloLabel setText:[props objectForKey:@"text"]]; // text from HelloCommands
    }
    return self;
}

@end
First, we call [super init] and if that succeeds, we try to load the nib file from our bundle. The nib file is called "HelloNib" in our case. We do not need to hard-code the bundle: [NSBundle bundleForClass:[self class]] returns the bundle which contains our class.
If all went fine, we can now store the view loaded from the nib file. Then we can also set the text on the label from properties passed to us in props.
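Since props contains whatever the command chose to pass, it can help to read values defensively. The following is a minimal sketch and not part of the original example: K3AStringProperty is a hypothetical helper name, and it uses only Foundation. It returns the stored value only if it really is a string, and a fallback otherwise.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the AE API): returns props[key] if it is
// an NSString, otherwise the supplied fallback.
static NSString *K3AStringProperty(NSDictionary *props, NSString *key, NSString *fallback)
{
    id value = [props objectForKey:key];
    if ([value isKindOfClass:[NSString class]])
    {
        return (NSString *)value;
    }
    return fallback; // key missing, props is nil, or the value is not a string
}
```

In initWithProperties you could then write [_helloLabel setText:K3AStringProperty(props, @"text", @"")] so the label always receives a string.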

Oh, but we almost forgot one thing. We still need to implement our principal class, otherwise none of this would work! Let's do that right away.
@implementation K3AHelloSnippetExtension

// required initialization
-(id)initWithSystem:(id<SESystem>)system
{
    if ( (self = [super init]) )
    {
        [system registerCommand:[K3AHelloCommands class]];
        [system registerSnippet:[K3AHelloSnippet class]];
    }
    return self;
}

@end
This is quite simple again. We just initialize ourselves and if all is OK, we register the Command and Snippet with system. Our snippet class is named K3AHelloSnippet and the command class is K3AHelloCommands.
You can also implement author, name, etc. methods if you wish.

HelloCommands.h

This is all fine, but where do we get the properties and how do we tell Siri to actually display this when we need? These tasks are handled in the HelloCommands file.

Our commands class, which we named K3AHelloCommands, again inherits from NSObject and must conform to the SECommand protocol. The only method which this protocol requires is handleSpeech. So let's declare it in the header file HelloCommands.h:
@interface K3AHelloCommands : NSObject<SECommand>

-(BOOL)handleSpeech:(NSString*)text tokens:(NSArray*)tokens tokenSet:(NSSet*)tokenset context:(id<SEContext>)ctx;

@end
When the handleSpeech method gets invoked, it is passed several parameters. Text is the recognized text as a string. Tokens is the "tokenized" recognized text, i.e. the text split into individual words, in order. Tokenset is the set of recognized words. You do not know the word order when using tokenset, but it is faster when you just want to check for the presence of specific words. Ctx is the context, which is explained later in this tutorial.
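The difference between tokens and tokenset can be sketched with two small helpers. This is only an illustration using Foundation: the helper names are hypothetical, and it assumes the tokens are NSString objects that compare equal to the words you look for (the tutorial matches the lowercase word "test", but the exact normalization done by AE is not stated here).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if every word in `words` is present in the token
// set, regardless of order or repetition. Suited to the tokenset parameter.
static BOOL K3ATokenSetHasAllWords(NSSet *tokenSet, NSArray *words)
{
    return [[NSSet setWithArray:words] isSubsetOfSet:tokenSet];
}

// Hypothetical helper: YES if `phrase` occurs in `tokens` as a run of
// consecutive words in the same order. Suited to the tokens parameter.
static BOOL K3ATokensContainPhrase(NSArray *tokens, NSArray *phrase)
{
    NSUInteger n = [tokens count];
    NSUInteger m = [phrase count];
    if (m == 0 || m > n)
    {
        return NO;
    }
    for (NSUInteger start = 0; start + m <= n; start++)
    {
        NSArray *window = [tokens subarrayWithRange:NSMakeRange(start, m)];
        if ([window isEqualToArray:phrase])
        {
            return YES;
        }
    }
    return NO;
}
```

Use the set check when only the presence of words matters, and the array check when the order of the words changes the meaning.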

HelloCommands.mm

The header file is ready, let's implement it in HelloCommands.mm:
#import "HelloCommands.h"


@implementation K3AHelloCommands

-(void)dealloc
{
    [super dealloc];
}
Again, we need to implement the dealloc method for cleaning up after we have been dismissed. We do not retain anything here so a simple call to [super dealloc] is sufficient.

Finally, we will implement the handleSpeech method:
-(BOOL)handleSpeech:(NSString*)text tokens:(NSArray*)tokens tokenSet:(NSSet*)tokenset context:(id<SEContext>)ctx
{
    // reacts to only one token - "test"
    if ([tokenset count] == 1 && [tokenset containsObject:@"test"])
    {
        // properties for the snippet
        NSDictionary* snipProps = [NSDictionary dictionaryWithObject:@"Text passed as a snippet property." forKey:@"text"];

        // create an array of views (we add two of them)
        NSMutableArray* views = [NSMutableArray arrayWithCapacity:2];
        [views addObject:[ctx createAssistantUtteranceView:@"Hello Snippet!!"]];
        [views addObject:[ctx createSnippet:@"K3AHelloSnippet" properties:snipProps]];

        // send views to the assistant
        [ctx sendAddViews:views];

        // inform the assistant that this is the end of the request
        [ctx sendRequestCompleted];

        return YES; // inform the system that the command has been handled (ignore the original one from the server)
    }

    return NO;
}

@end
We want to create a very simple extension, one that displays our snippet when we say "test". First, we check if there is only one token in the set and, if so, whether it is the word "test". If these conditions are met, we can tell Siri to display our snippet and to say something. There are two ways of doing this, but since we want to do two things at once, it is better to first prepare them and then send them both in one command.

So we create an array of views, which will hold these. Then we can add the "utterance view", which is the standard Siri speech bubble (this also tells Siri to say what we want) and also our snippet which we want Siri to display to the user.

What is context?

This is all fine but why are these methods called on ctx ("context")? This "context" is a series of questions/responses between the user and Siri. You can think of it as the topic of the conversation. Until the sendRequestCompleted method is called, all Siri actions are linked with the current context. For example, when you tell Siri to set the timer, she replies "For how long?" and then you specify the duration. Once Siri is satisfied (or she gives up), she will end the context. Without this feature, she would not know that you are still referring to the timer when you tell her its duration.

So if you want to create complicated interaction with Siri, with many questions and answers, you can do that even in another thread - just make sure the context stays the same and call sendRequestCompleted when you are done. Otherwise Siri would think that you still want to continue.

Finishing

Let's get back to our simple command. After creating the views in the array, we can send the whole array to Siri using sendAddViews and tell her that we are done with this by calling sendRequestCompleted.

We are almost done. The only question that remains is - whether to return YES or NO. If we were able to handle the request - we return YES. Otherwise, we return NO.

You should not return YES unless you are quite sure that the user wanted your extension to respond! When you return YES, the original action and response from Siri get thrown away. Our simple extension is fairly safe: it returns YES only when the user says the word "test" exclusively (though it can be repeated multiple times, because the tokenset stores each word only once). Siri itself does not handle "test", "test test", etc.
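To see why repeated words still match, recall that a set keeps each distinct word only once. A small sketch, assuming the tokenset behaves like an NSSet built from the tokens array (K3AIsOnlyTheWordTest is a hypothetical name, not part of the AE API):

```objc
#import <Foundation/Foundation.h>

// Mirrors the condition used in handleSpeech: exactly one distinct word,
// and that word is "test".
static BOOL K3AIsOnlyTheWordTest(NSArray *tokens)
{
    NSSet *tokenSet = [NSSet setWithArray:tokens]; // duplicates collapse here
    return [tokenSet count] == 1 && [tokenSet containsObject:@"test"];
}
```

With this check, the token lists ("test") and ("test", "test") are both accepted, while ("test", "me") is rejected and left for Siri to handle.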

How to "run"

You can't "play" or "run" the extension from within Xcode directly. After you build it (Cmd+B shortcut or Project-Build), Xcode produces a bundle with the binary inside, probably in the DerivedData directory or in the project root directory (this can be set up in the global Xcode settings, in the Locations tab). It is also possible to create custom build steps in Xcode where you can, for example, run a custom script which uses ssh to delete the old bundle on the device, copy the newly built one over and restart SpringBoard. This can greatly improve your iteration time (I am using this for settings bundle development as well and it's cool: it kills Settings, deletes the old bundle, copies the new one, opens Settings remotely, then I scroll down, test, and repeat) - but please don't ask me for instructions as it's individual and my time is limited. Google, learn, try and have fun. Here is a tool to open apps on the device. Also don't forget to sign the binary inside the bundle with the ldid -S binary command.

Notes

The API you just learned is for AE version 1.0.1. All extensions will be backwards compatible - e.g. AE v9.x should be able to run old 1.0.1 extensions. I am already working on extending the API to allow more cool things (different languages, matching sentence patterns, regular expressions, maybe even matching sentence elements like nouns/verbs/prepositions). Stay tuned, it's gonna be interesting! :)

Also note that the extension will be loaded into SpringBoard, which means you need to prefix the names of all your classes so that they do not collide with other extensions and tweaks loaded into SpringBoard. For example "K3AYoutubeCommands" is a good class name, while "Commands" is a really bad one. When there are two classes with the same name, only one of them will be used and it's unpredictable which one it will be!!
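While debugging, one way to spot an obvious collision is to ask the Objective-C runtime whether a class name is already registered. This is only a sketch of a debugging aid, not something AE requires, and K3AClassNameIsTaken is a hypothetical helper name. It can only see classes loaded into the current process at the moment it runs, so it does not replace a proper prefix.

```objc
#import <Foundation/Foundation.h>

// Hypothetical debugging aid: YES if a class with this name is already
// registered with the Objective-C runtime in the current process.
static BOOL K3AClassNameIsTaken(NSString *className)
{
    return NSClassFromString(className) != nil;
}
```

For example, K3AClassNameIsTaken(@"NSString") returns YES in any process that links Foundation, which is exactly why short, unprefixed names are risky.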


Copyright © 2012 K3A.me.
Logo by @sebhel19 (sebastianhelms.de).